NAVAL RESEARCH LOGISTICS QUARTERLY
MARCH 1977, VOL. 24, NO. 1
OFFICE OF NAVAL RESEARCH, NAVSO P-1278

EDITORS
Murray A. Geisler, Logistics Management Institute
W. H. Marlow, The George Washington University
Bruce J. McDonald, Office of Naval Research

MANAGING EDITOR
Seymour M. Selig, Office of Naval Research, Arlington, Virginia 22217

ASSOCIATE EDITORS
Marvin Denicoff, Office of Naval Research
Alan J. Hoffman, IBM Corporation
Neal D. Glassman, Office of Naval Research
Jack Laderman, Bronx, New York
Thomas L. Saaty, University of Pennsylvania
Henry Solomon, The George Washington University

The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations. Information for Contributors is indicated on inside back cover.

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office of Naval Research. Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations, P-35 (Revised 1-74).

A TWO-ECHELON INVENTORY MODEL WITH PURCHASES, DISPOSITIONS, SHIPMENTS, RETURNS AND TRANSSHIPMENTS

Bruce Hoadley and Daniel P. Heyman
Bell Telephone Laboratories
Holmdel, New Jersey

ABSTRACT

This paper presents a one-period two-echelon inventory model with one warehouse in the first echelon and n warehouses in the second echelon. At the beginning of the period the stock levels at all facilities are adjusted by purchasing or disposing of items at the first echelon, returning or shipping items between the echelons and transshipping items within the second echelon. During the period, demands (which may be negative) are placed on all warehouses in the second echelon and an attempt is made to satisfy shortages either by an expedited shipment from the first echelon to the second echelon or an expedited transshipment within the second echelon. The decision problem is to choose an initial stock level at the first echelon (by a purchase or a disposition) and an initial allocation so as to minimize the initial stock movement costs during the period plus inventory carrying costs and system shortage costs at the end of the period. It is shown that the objective function takes on one of four forms, depending on the relative magnitudes of the various shipping costs. All four forms of the objective function are derived and proven to be convex. Several applications of this general model are considered. We also consider multi-period extensions of the general model and an important special case is solved explicitly.

1. INTRODUCTION

In telephone and many military supply systems, when an item is taken out of inventory and placed with a customer, the equipment is still owned by the supplying organizations. For example, customer station apparatus (e.g., telephone sets) are owned by the operating telephone company and aircraft engines or on-board electronic equipment are owned by a branch of the military service. These items are used by the customer and then returned for possible repair and reuse by another customer. Typically, these items are supplied from a multi-echelon multi-location inventory system.
Issues that must be resolved at various levels in a supply line are how much new material should be ordered or how much existing stock should be disposed of, and how the existing stock should be allocated to the various echelons. This typically complex problem is further complicated by the fact that at many locations material returns from a lower echelon can exceed demands, thus causing a negative net demand.

Figure 1 is a schematic diagram of a general supply line where the boxes represent the stocking locations and the arrows represent material flows. The output from the factories must be allocated to a few very large warehouses we will call material management centers. Each material management center supplies a set of service centers or central stocks which in turn supply the work locations and receive returned material from them. Finally, the material flows from the work locations to the customers and after use the material flows back to the work locations.

Figure 1. A general supply line. [Diagram; stocking-location labels: Material Management Centers, Service Centers, Work Locations.]

The kinds of questions that arise in this system are:
(i) How should the factory output be allocated?
(ii) When and how much should one echelon order from another?
(iii) Should the material be stocked at the material management center which then would supply the service centers on demand, or should some of the material be stocked at the service centers?
(iv) Should the work locations keep stock, or should it be returned to the service centers?
(v) When is it economical to transship material between locations in the same echelon?

1.1. Outline and Motivation

To study these questions, we will consider the two echelon system shown below:

[Diagram with the labels "first echelon" and "second echelon".]

The location in the first echelon will be called facility 0 and the locations in the second echelon will be called facilities 1, 2, . . ., n.
Shipments from an outside supplier are received only at facility 0, and external demands and returns occur only at facilities 1, 2, . . ., n. We assume that the net demands (gross demands minus returns) are independent from location to location and from time period to time period. We emphasize that net demand at a second echelon facility may be negative. Regular and expedited shipments are allowed between any pair of facilities. Since there is no external demand at facility 0, by assuming that it is always less expensive to expedite a shipment between any two second echelon facilities than to expedite shipments between them via facility 0, we do, in effect, prohibit expedited shipments from the second echelon to the first. Facility 0 may dispose of stock but the other facilities cannot.

In Section 2 we present a general theory for such a system. In particular, we consider the following decision problems. How much inventory should
a) facility 0 order or dispose of;
b) facility 0 ship to facilities 1, 2, . . ., n;
c) facility 0 recall from facilities 1, 2, . . ., n;
d) be transshipped between second echelon facilities.
We will also obtain the operating characteristics of the expedited shipments and transshipments.

In Section 3 it is shown how the general model can be applied to three applications that motivated this study. They are the stock control problem in a regional material distribution system, the returns problem and the transshipment problem in an inventory management system. Multi-period extensions of the general model are discussed in Section 4 and a multi-period formulation of the stock control problem is solved.

1.2. Relation to Previous Work

Clark [3] is a general survey of multi-echelon systems. Allen [1] considers transshipments among several locations (which correspond to our lower echelon) and finds an algorithm to minimize transshipment plus storage costs.
Bessler and Veinott [2] consider a multi-period model of a multi-echelon system with regular and expedited redistributions, but they strongly restrict the flows in their model by assuming that each location can receive shipments from exactly one other location. They obtain bounds on the amount of stock on hand after the regular movements at each location. Clark and Scarf [4] consider purchasing policies for a multi-period multi-echelon system. They solved the problem when activities are arranged in series. They discuss the problem treated in this paper of activities arranged in parallel; they specifically [p. 486] exclude transshipments, claiming they are not done in practice; and state that "The theoretical and computational aspects of the problem become quite complex." Fukuda [9] extends the (activities in series) model of Clark and Scarf by allowing excess stock in each echelon to be disposed of. Gross [8] describes the special case of our single-period model where only regular transshipments are allowed and obtains an algorithm to find the optimal order quantity at each second echelon facility when there is no a priori effective bound on the number of items available. Simpson [12] includes such a bound but does not allow transshipments. Das [5] considers a two-location single-echelon single-period model where transshipments are allowed at a given time epoch in the period; returns are not allowed. He finds conditions for the cost function to be convex and for no transshipments to be optimal. The single-echelon model with expedited transshipments of Krishnan and Rao [10] is a very special case of our model.

In our opinion, existing multi-echelon inventory theories often fall short of practical usefulness because they are overly complicated or contain very restrictive assumptions. In this paper we attempt to develop a reasonably simple model that when used with care can help analyze some of the questions raised above.

2. GENERAL FORMULATION AND RESULTS

We assume that events and costs occur in the following manner. At the start of the period, the stocks at each facility are observed. The inventory manager is then given the option of adjusting the stock levels at all facilities in the system. The adjustment can be accomplished by
(i) purchasing items at facility 0,
(ii) disposing of items at facility 0,
(iii) returning items from the second echelon to the first echelon,
(iv) shipping items from the first echelon to the second echelon,
(v) transshipping between facilities in the second echelon.
A per unit cost is associated with each of these transactions, and we assume that the adjustments chosen all occur instantaneously.

After the new stock levels are set, the demands at each second echelon facility are realized. These demands may be negative, indicating that the number of withdrawals requested was smaller than the number of items returned. If the stock at a facility is less than the demand, the manager will attempt to satisfy the demand by making an expedited shipment from the first echelon or an expedited transshipment from another facility.

A different holding charge for each echelon is incurred for items remaining in the system at the end of the period. The same holding charge applies to all second echelon facilities. A system shortage cost is incurred for each demand not satisfied by the system. This cost is the same for all facilities and is proportional to the number of unsatisfied demands.

2.1. Notations and Definitions

Let $x_i$ be the stock level at facility $i$ ($i = 0, 1, \ldots, n$) at the beginning of the period (a negative $x_i$ can be interpreted as a number of backorders), $y_i \ge 0$ be the stock level at facility $i$ after the initial adjustments, and $D_i$ be the independent (random) demands at facility $i$ ($i \ne 0$). Now define the quantities

(1) $\mathrm{IN}_i = (y_i - x_i)^+$

(2) $\mathrm{OUT}_i = (x_i - y_i)^+$

(3) $V_i = (D_i - y_i)^+$

for $i = 1, 2, \ldots, n$, where $(a)^+ = \max(a, 0)$.
Then at facility $i$, $\mathrm{IN}_i$ is the number of items initially added to the stock, $\mathrm{OUT}_i$ is the number of items initially subtracted from the stock, and $V_i$ is the shortage before the expedited adjustments are made.

Now we consider the various types of stock movements. At the beginning of the period, let
$P$ = total number of purchases at facility 0,
$J$ = total number of dispositions (junks, sales) at facility 0,
$S$ = total number of shipments from facility 0 to the second echelon,
$T$ = total number of transshipments among second echelon facilities,
$R$ = total number of returns from the second echelon to facility 0.
These movements will be called regular movements and the per unit costs of each, for all pairs of facilities, are $C_P$, $C_J$, $C_S$, $C_T$ and $C_R$, respectively.

For expedited adjustments, let
$S^*$ = total number of expedited shipments from facility 0 to the second echelon,
$T^*$ = total number of expedited transshipments among facilities in the second echelon.
The per unit costs of these movements, for all pairs of facilities, are $C_S^*$ and $C_T^*$, respectively. Since $S^*$ and $T^*$ depend on the demands, they are random variables.

At the end of the period, after the expedited movements have occurred, let
$V_0$ = system shortage (the excess of total demand over total stock),
$s$ = system shortage cost per unit short,
$I_0$ = inventory at the first echelon,
$h_0$ = carrying cost per unit of $I_0$,
$I$ = inventory at the second echelon,
$h$ = carrying cost per unit of $I$.
Since $V_0$, $I_0$ and $I$ depend on the demands, they are also random variables.

We note that all movement costs depend only on the number of items moved and not on the distance between the facilities involved. Also there are no fixed costs in the model.

Finally, we define the cost of regular movements by

(4) $C_{RM} = C_P P + C_J J + C_S S + C_T T + C_R R,$

the expected cost of expedited movements by

(5) $C_{EM} = C_S^* E(S^*) + C_T^* E(T^*),$

the expected carrying cost by

(6) $CC = h_0 E(I_0) + h E(I),$

the expected system shortage cost by

(7) $CSS = s E(V_0)$

and the total expected cost by the sum of the four costs given above. Our problem is to choose nonnegative values of $y_0, y_1, y_2, \ldots, y_n$ to minimize the total expected costs.

2.2. General Results for Inventory Levels and Costs

To simplify the formulas, let

$X = \sum_{i=0}^{n} x_i$ and $Y = y_0 + \sum_{i=1}^{n} y_i.$

By using a conservation argument, one easily establishes that

(8) $I_0 + I = \left(Y - \sum_{i=1}^{n} D_i\right)^+$

(9) $V_0 = \left(\sum_{i=1}^{n} D_i - Y\right)^+.$

We can also use conservation arguments to show

(10a) $\sum_{i=1}^{n} \mathrm{IN}_i = S + T,$

(10b) $\sum_{i=1}^{n} \mathrm{OUT}_i = R + T,$

and

(10c) $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} x_i + S - R$

or

(11) $R = \sum_{i=1}^{n} (x_i - y_i) + S.$

Substituting (10a) and (11) in (4) we obtain

(12) $C_{RM} = C_P (Y - X)^+ + C_J (X - Y)^+ + C_R \sum_{i=1}^{n} (x_i - y_i) + C_T \sum_{i=1}^{n} (y_i - x_i)^+ + (C_S + C_R - C_T) S.$

Since second echelon shortages are satisfied by either $S^*$, $T^*$ or $V_0$, we obtain

(13) $\sum_{i=1}^{n} V_i = S^* + T^* + V_0.$

Substituting (13) into (5) yields

$C_{EM} = C_S^* E(S^*) + C_T^* E\left[\sum_{i=1}^{n} V_i - S^* - V_0\right],$

which, using (9), can be written as

(14) $C_{EM} = C_T^* E\left[\sum_{i=1}^{n} V_i - V_0\right] + (C_S^* - C_T^*) E[S^*].$

Since

(15) $I_0 = Y - \sum_{i=1}^{n} y_i - S^*,$

we can obtain from (8)

(16) $CC = h E[I + I_0] + (h_0 - h) E[I_0] = h E\left[\left(Y - \sum_{i=1}^{n} D_i\right)^+\right] + (h - h_0)\left(E[S^*] + \sum_{i=1}^{n} y_i - Y\right).$

From (12), (14) and (16), together with the identities $(X - Y)^+ = (Y - X)^+ - (Y - X)$ and $(Y - \sum D_i)^+ = (\sum D_i - Y)^+ + Y - \sum D_i$, we can derive that the objective function $F$ can be written as

(17) $F = -h E\left[\sum_{i=1}^{n} D_i\right] + C_R \sum_{i=1}^{n} x_i + C_J X + (h - h_0 - C_R) \sum_{i=1}^{n} y_i + (h_0 - C_J) Y + C_T \sum_{i=1}^{n} (y_i - x_i)^+ + (C_P + C_J)(Y - X)^+ + C_T^* E\left[\sum_{i=1}^{n} (D_i - y_i)^+\right] + (h + s - C_T^*) E\left[\left(\sum_{i=1}^{n} D_i - Y\right)^+\right] + (h - h_0 + C_S^* - C_T^*) E(S^*) + (C_S + C_R - C_T) S.$

Equation (17) can be written more conveniently by introducing the following functions of $y = (y_0, y_1, y_2, \ldots, y_n)$:

(18a) $f_1(y) = \sum_{i=1}^{n} y_i,$

(18b) $f_2(y) = \sum_{i=1}^{n} (y_i - x_i)^+,$

(18c) $f_3(y) = \left[\sum_{i=1}^{n} (y_i - x_i)\right]^+,$

(18d) $f_4(y) = E\left[\sum_{i=1}^{n} (D_i - y_i)^+\right],$

(18e) $f_5(y) = E\left[\left(\sum_{i=1}^{n} (D_i - y_i)\right)^+\right],$

(18f) $f_6(y) = E\left[\left(\sum_{i=1}^{n} (D_i - y_i)^+ - y_0\right)^+\right],$

(18g) $f_7(y) = \sum_{i=0}^{n} y_i,$

(18h) $f_8(y) = \left(\sum_{i=0}^{n} y_i - X\right)^+,$

(18i) $f_9(y) = E\left[\left(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\right)^+\right].$

With these definitions, (17) becomes

(19) $F(y) = -h E\left[\sum_{i=1}^{n} D_i\right] + C_R \sum_{i=1}^{n} x_i + C_J X + (h - h_0 - C_R) f_1(y) + (h_0 - C_J) f_7(y) + C_T f_2(y) + (C_P + C_J) f_8(y) + C_T^* f_4(y) + (h + s - C_T^*) f_9(y) + (h - h_0 + C_S^* - C_T^*) E(S^*) + (C_S + C_R - C_T) S.$
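The expectation terms among the functions in (18) rarely have closed forms for general demand distributions. As an illustration only, the following minimal sketch estimates $f_4$, $f_5$, $f_6$ and $f_9$ by Monte Carlo simulation. The helper names (`pos`, `estimate_f`, `normal_demands`) and the normal demand distributions are assumptions made for this example and are not part of the paper.

```python
import random

def pos(a):
    """Positive part (a)^+ = max(a, 0)."""
    return max(a, 0.0)

def estimate_f(y, sample_demands, n_samples=20000, seed=0):
    """Monte Carlo estimates of f4, f5, f6 and f9 from (18).

    y = [y0, y1, ..., yn]; sample_demands(rng) returns [D1, ..., Dn].
    """
    rng = random.Random(seed)
    y0, ys = y[0], y[1:]
    totals = {"f4": 0.0, "f5": 0.0, "f6": 0.0, "f9": 0.0}
    for _ in range(n_samples):
        d = sample_demands(rng)
        local_short = sum(pos(di - yi) for di, yi in zip(d, ys))  # sum of V_i
        net = sum(di - yi for di, yi in zip(d, ys))  # second echelon demand minus stock
        totals["f4"] += local_short
        totals["f5"] += pos(net)
        totals["f6"] += pos(local_short - y0)
        totals["f9"] += pos(net - y0)  # system shortage V_0 of (9)
    return {name: t / n_samples for name, t in totals.items()}

# Hypothetical example: two second echelon facilities with normal net demands
# (a normal demand can be negative, which the model allows).
def normal_demands(rng):
    return [rng.gauss(4.0, 2.0), rng.gauss(3.0, 1.0)]

print(estimate_f([2.0, 5.0, 3.0], normal_demands))
```

For any sample, $V_0 \le \left(\sum (D_i - y_i)\right)^+ \le \sum V_i$ because $y_0 \ge 0$, so the estimates should satisfy $f_9 \le f_5 \le f_4$ and $f_6 \le f_4$, which gives a quick sanity check on an implementation.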
Recognizing that $S$ and $E(S^*)$ depend on $y$, we are led to a study of movement strategies.

2.3. General Results for Movement Strategies

The movement strategies will obviously depend on the movement costs. We distinguish four cases which are analyzed separately.

CASE 1: $C_T > C_R + C_S$

In this case, it is more expensive to transship items between facilities in the second echelon than to first return them to the first echelon and then ship them to another facility in the second echelon. Hence, all the initial stock increases in the second echelon will be effected by a shipment, so

(20) $S = \sum_{i=1}^{n} (y_i - x_i)^+ = f_2(y).$

CASE 2: $C_T < C_R + C_S$

In this case shipments will only be used to satisfy those initial stock increases that cannot be satisfied by transshipments, so

(21) $S = \left[\sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i\right]^+ = f_3(y).$

CASE 3: $C_T^* > C_S^*$

In this case expedited shipments will be preferred to expedited transshipments, so as many shortages as possible will be satisfied with the stock at the first echelon. Thus,

$S^* = \min\left[\sum_{i=1}^{n} V_i,\ y_0\right] = \sum_{i=1}^{n} V_i - \left[\sum_{i=1}^{n} V_i - y_0\right]^+,$

and so

(22) $E(S^*) = f_4(y) - f_6(y).$

CASE 4: $C_T^* < C_S^*$

In this case expedited transshipments will be used to satisfy as much of the shortages as possible, so

$T^* = \min\left[\sum_{i=1}^{n} (D_i - y_i)^+,\ \sum_{i=1}^{n} (y_i - D_i)^+\right].$

Since

$\left(\sum_{i=1}^{n} D_i - \sum_{i=1}^{n} y_i\right)^+$

is the demand that cannot be satisfied by the initial allocations or by expedited transshipments, we must have

$\left(\sum_{i=1}^{n} D_i - \sum_{i=1}^{n} y_i\right)^+ = S^* + V_0;$

therefore, from (9),

(23) $E(S^*) = f_5(y) - f_9(y).$

Thus, given the relationships among the movement costs, equations (20)-(23) allow us to write (19) in terms of the functions defined in (18), yielding an objective function that is expressed in terms of the decision variables and the distributions of the random demands.

2.4. The Four General Forms of the Objective Function

The four comparisons between movement costs given in the last section give rise to four different cases of the general problem.
They are

PROBLEM I: $C_T^* < C_S^*$, $C_T > C_R + C_S$,
PROBLEM II: $C_T^* < C_S^*$, $C_T < C_R + C_S$,
PROBLEM III: $C_T^* > C_S^*$, $C_T > C_R + C_S$,
PROBLEM IV: $C_T^* > C_S^*$, $C_T < C_R + C_S$.

Using (20)-(23) the objective function (dropping the constant) for each problem is given in Table 1 below. An entry of 0 means that the function does not appear in that problem.

Table 1. Coefficients of the Objective Function

Function   Problem I                  Problem II                 Problem III                Problem IV
           C_T^* < C_S^*              C_T^* < C_S^*              C_T^* > C_S^*              C_T^* > C_S^*
           C_T > C_R + C_S            C_T < C_R + C_S            C_T > C_R + C_S            C_T < C_R + C_S
f_1(y)     h - h_0 - C_R              h - h_0 - C_R              h - h_0 - C_R              h - h_0 - C_R
f_2(y)     C_S + C_R                  C_T                        C_S + C_R                  C_T
f_3(y)     0                          C_S + C_R - C_T            0                          C_S + C_R - C_T
f_4(y)     C_T^*                      C_T^*                      h - h_0 + C_S^*            h - h_0 + C_S^*
f_5(y)     h - h_0 + C_S^* - C_T^*    h - h_0 + C_S^* - C_T^*    0                          0
f_6(y)     0                          0                          h_0 - h + C_T^* - C_S^*    h_0 - h + C_T^* - C_S^*
f_7(y)     h_0 - C_J                  h_0 - C_J                  h_0 - C_J                  h_0 - C_J
f_8(y)     C_P + C_J                  C_P + C_J                  C_P + C_J                  C_P + C_J
f_9(y)     h_0 + s - C_S^*            h_0 + s - C_S^*            h + s - C_T^*              h + s - C_T^*

2.5. Convexity of the Objective Function

We will now give some simple sufficient conditions for the objective functions shown in Table 1 to be convex. First, we show that $f_i(y)$, $i = 1, 2, \ldots, 9$ are convex functions.

THEOREM 1: The functions $f_i(y)$, $i = 1, 2, \ldots, 9$ are convex functions of $y$.

PROOF:
a) Since $f_1(y)$ is a linear function, both $-f_1(y)$ and $f_1(y)$ are convex.
b) The function $(y_i - x_i)^+$ is obviously a convex function of $y_i$, so $f_2(y)$ is convex because it is a sum of convex functions.
c) $f_3(y)$ is the maximum of $\sum_{i=1}^{n} (y_i - x_i)$ and the null function, both of which are convex, so $f_3(y)$ is convex.
d) Since $\sum_{i=1}^{n} (D_i - y_i)^+$ is clearly convex in $y$ for any choice of $D_i$, $i = 1, 2, \ldots, n$, $f_4(y)$ is convex because convexity is preserved under expectations.
e) Since $\left[\sum_{i=1}^{n} (D_i - y_i)\right]^+$ is convex for any choice of $D_i$, $i = 1, \ldots, n$, $f_5(y)$ is convex. Since $\left[\sum_{i=1}^{n} (D_i - y_i)^+ - y_0\right]^+$ is the maximum of a convex function and the null function for any choice of $D_i$, $i = 1, \ldots, n$, $f_6(y)$ is convex.
The remainder of the proof follows in a similar manner. Q.E.D.

So sufficient conditions for the objective functions to be convex are that the coefficients of all the $f_i$'s except $f_1$ and $f_7$ are nonnegative.
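The case logic of Section 2.3 and the entries of Table 1 are easy to mechanize. The sketch below is one possible encoding, not code from the paper: the function names and the parameter names `CS_exp` and `CT_exp` (standing for $C_S^*$ and $C_T^*$) are hypothetical, and ties between the compared costs, which the paper does not treat, are assigned arbitrarily to Cases 2 and 4.

```python
def table1_coefficients(h, h0, s, CP, CJ, CS, CT, CR, CS_exp, CT_exp):
    """Coefficients of f1..f9 in the objective function, following Table 1.

    CS_exp and CT_exp are the expedited costs C_S^* and C_T^*.
    """
    coef = {"f1": h - h0 - CR, "f7": h0 - CJ, "f8": CP + CJ}
    if CT > CR + CS:                      # Case 1: S = f2(y)
        coef["f2"], coef["f3"] = CS + CR, 0.0
    else:                                 # Case 2 (ties included): S = f3(y)
        coef["f2"], coef["f3"] = CT, CS + CR - CT
    if CT_exp > CS_exp:                   # Case 3: E(S*) = f4(y) - f6(y)
        coef.update(f4=h - h0 + CS_exp, f5=0.0,
                    f6=h0 - h + CT_exp - CS_exp, f9=h + s - CT_exp)
    else:                                 # Case 4 (ties included): E(S*) = f5(y) - f9(y)
        coef.update(f4=CT_exp, f5=h - h0 + CS_exp - CT_exp,
                    f6=0.0, f9=h0 + s - CS_exp)
    return coef

def sufficient_for_convexity(coef):
    """True if every coefficient other than those of f1 and f7 is nonnegative."""
    return all(v >= 0 for name, v in coef.items() if name not in ("f1", "f7"))

# Hypothetical cost data falling under Problem II.
coef = table1_coefficients(h=1, h0=1, s=10, CP=5, CJ=-1, CS=1, CT=1.5, CR=1,
                           CS_exp=3, CT_exp=2)
print(coef, sufficient_for_convexity(coef))
```

The check is only sufficient: a negative coefficient means the test is inconclusive, not that the objective function is nonconvex.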
A special case of interest in which the coefficients of all the $f_i$'s except $f_1$ and $f_7$ are nonnegative is

(24a) $h = h_0$

(24b) $C_P > -C_J$

(24c) $s > C_S^*$

(24d) $s > C_T^*$;

i.e., the carrying cost factor is the same for both echelons, the purchase price is greater than the disposition value and the system shortage cost is greater than both the expedited shipping and transshipment costs. If the conditions in (24) hold, then the problem of finding a nonnegative value of $y$ to minimize (19) is a convex programming problem. This type of mathematical programming problem has been extensively studied (see, e.g., Eaves and Saigal [6], Fiacco and McCormick [7], Mifflin [11], and Zangwill [13]) and algorithms and computer codes are available to solve it.

We should point out that the functions $f_i(y)$ are not necessarily differentiable. For example, $f_2(y)$ is not differentiable at those points where $y_i = x_i$, $i = 1, \ldots, n$, and $f_4(y)$ is not differentiable at those points where $y_i = d_i$ if $\Pr\{D_i = d_i\} > 0$. However, the manner in which the derivative fails to exist is very simple, so algorithms that require a differentiable objective function should be adequate for this problem.

For the inventory applications considered here, $y$ must be composed of integers to be physically meaningful. Since we do not know of an algorithm that can solve such a problem, we have chosen to ignore the integer aspects of $y$. Thus, the usual caveats about solving integer problems as noninteger ones apply.

3. APPLICATIONS

In this section we discuss how the general model can be used in several applications and find the optimal solution for some of them. To keep the characterizations relatively simple, we assume that the c.d.f. of $D_i$ is of the form

(25) $\Pr\{D_i \le y\} = \begin{cases} \int_{-\infty}^{y} g_i(x)\,dx, & y < 0 \\ \int_{-\infty}^{0} g_i(x)\,dx + p_i, & y = 0 \\ \int_{-\infty}^{0} g_i(x)\,dx + p_i + \int_{0}^{y} f_i(x)\,dx, & y > 0; \end{cases}$

i.e., $D_i$ has a probability mass $p_i$ at zero but otherwise $D_i$ is a continuous random variable with a density function. This assumption allows us to model both high and low volume products.

In deriving the characterizations, the following Lemmas are useful.
LEMMA 1: For a random variable $D$,

$\frac{\partial^+}{\partial y} E[(D - y)^+] = -\Pr\{D > y\}, \qquad \frac{\partial^-}{\partial y} E[(D - y)^+] = -\Pr\{D \ge y\},$

where $\partial^+$ and $\partial^-$ denote right and left derivatives respectively.

PROOF: The Lemma follows directly by interchanging the derivative and expectation operators. Q.E.D.

The second Lemma is due to Allen [1, Theorem 1, p. 339] and we will use it extensively.

LEMMA 2: (Allen) Let $F(y)$ be a real, convex differentiable function in n-dimensional Euclidean space. Let $l_i$ ($u_i$) be a minimum (maximum) of the i-th coordinate $y_i$ if $y_i$ is bounded from below (above). Let $R$ be the set of all points so restricted. Then a necessary and sufficient condition that a point $y^\circ$ minimize $F$ over $R$ is that either

$\frac{\partial F(y^\circ)}{\partial y_i} = 0$

or

$y_i^\circ = l_i$ and $\frac{\partial F(y^\circ)}{\partial y_i} \ge 0$

or

$y_i^\circ = u_i$ and $\frac{\partial F(y^\circ)}{\partial y_i} \le 0$

is satisfied for $i = 1, 2, \ldots, n$.

Allen also describes an algorithm for finding $y^\circ$, so a method for obtaining numerical solutions already exists.

3.1. The Stock Control Problem

Consider a regional distribution system consisting of a material management center (MMC) (the first echelon) and several service centers (the second echelon). The stocks owned by the region are purchased through the MMC from the factory and are allocated to all facilities in the region.

In the case of a product where growth is taking place at all service centers, it is appropriate to assume that $\Pr\{D_i \ge 0\} = 1$ and that there is no need to return material to the MMC, transship material between service centers or to dispose of material on a regular basis; hence, we also assume that $y_i \ge x_i$, $i = 1, \ldots, n$. Also, since regular shipments are cheaper than expedited shipments, and the average distance between service centers is less than the average distance between the MMC and the service centers, we assume that $C_S < C_T^* < C_S^* < s$. This is a special case of either Problem I or II, which are the same in this case because $f_2(y) = f_3(y)$. To avoid unnecessary complication, it is also assumed that $h = h_0$.
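Note that the assumption $C_T^* < C_S^*$ amounts to assuming that when a demand cannot be satisfied by the local service center, an expedited transshipment is made from another service center if possible. The multiperiod extension of this problem is solved in Section 4.

From the general theory, the problem is to find a vector $y^\circ = (y_0^\circ, y_1^\circ, \ldots, y_n^\circ)$ which minimizes

(26) $F_1(y) = C_P\left(\sum_{i=0}^{n} y_i - X\right) + h\left[\sum_{i=0}^{n} y_i - \sum_{i=1}^{n} E(D_i)\right] + (h + s - C_S^*) E\left[\left(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\right)^+\right] + C_S \sum_{i=1}^{n} (y_i - x_i) + C_T^* E\left[\sum_{i=1}^{n} (D_i - y_i)^+\right] + (C_S^* - C_T^*) E\left[\left(\sum_{i=1}^{n} (D_i - y_i)\right)^+\right],$

subject to

(27) $y_0 \ge 0$; $y_i \ge x_i$, $i = 1, \ldots, n$; $\sum_{i=0}^{n} y_i \ge X.$

According to Lemma 2, to find $y^\circ$, we need only construct a vector for which either

(28a) $\frac{\partial F_1(y^\circ)}{\partial y_0} = 0$ or $y_0^\circ = 0$ and $\frac{\partial F_1(y^\circ)}{\partial y_0} \ge 0,$

and for which either

(28b) $\frac{\partial F_1(y^\circ)}{\partial y_i} = 0$ or $y_i^\circ = x_i$ and $\frac{\partial F_1(y^\circ)}{\partial y_i} \ge 0$

for $i = 1, \ldots, n$, and for which

(28c) $\sum_{i=0}^{n} y_i^\circ \ge X.$

To construct such a vector, first note that $\frac{\partial F_1}{\partial y_0} = (\ge)\ 0$ iff

(29a) $\Pr\left\{\sum_{i=1}^{n} D_i > \sum_{i=0}^{n} y_i\right\} = (\le)\ \frac{C_P + h}{h + s - C_S^*}$

and $\frac{\partial F_1}{\partial y_i} = (\ge)\ 0$, $i = 1, \ldots, n$, iff

(29b) $\Pr\{D_i > y_i\} = (\le)\ G(y),$

where

$G(y) = \left[C_P + h - (h + s - C_S^*) \Pr\left\{\sum_{i=1}^{n} D_i > \sum_{i=0}^{n} y_i\right\} + C_S - (C_S^* - C_T^*) \Pr\left\{\sum_{i=1}^{n} D_i > \sum_{i=1}^{n} y_i\right\}\right] \Big/ C_T^*.$

To keep the analysis relatively simple, we assume that $s > C_P + C_S^*$ so that $(C_P + h)/(h + s - C_S^*) < 1$, and assume that $Y$ defined by

(30a) $\Pr\left\{\sum_{i=1}^{n} D_i > Y\right\} = \frac{C_P + h}{h + s - C_S^*}$

is greater than $X$. Now for $0 < \epsilon < 1$ and $i = 1, \ldots, n$, define

(30b) $y_i(\epsilon) = \begin{cases} x_i & \text{if } \Pr\{D_i > x_i\} \le \epsilon \\ \text{a solution to } \Pr\{D_i > y_i\} = \epsilon & \text{otherwise,} \end{cases}$

and in the latter case $y_i(\epsilon) > x_i$, and define

(30c) $y_0(\epsilon) = \left[Y - \sum_{i=1}^{n} y_i(\epsilon)\right]^+, \qquad R(\epsilon) = G(y(\epsilon)).$

Note that either

$Y = \sum_{i=0}^{n} y_i(\epsilon)$

or

$y_0(\epsilon) = 0$ and $Y < \sum_{i=0}^{n} y_i(\epsilon);$

hence, $y(\epsilon)$ satisfies (28a) and (28c) for any $\epsilon$. Now,

$R(0) = \frac{C_P + h + C_S}{C_T^*}, \qquad R(1) = \frac{C_S - (C_S^* - C_T^*) \Pr\left\{\sum_{i=1}^{n} D_i > \sum_{i=1}^{n} x_i\right\}}{C_T^*} < 1,$

and $R(\epsilon)$ is a nonincreasing function of $\epsilon$; therefore, $R(\epsilon) = \epsilon$ has a solution, $\epsilon^*$, and $y^\circ = y(\epsilon^*)$ satisfies (28).

To interpret the results, about the only thing that can be said in general is that all second echelon facilities that receive stock will experience the same shortage probability, and if some stock is initially allocated to the first echelon, then the system shortage probability is $[C_P + h]/[h + s - C_S^*]$.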
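The construction in (30) can be sketched numerically. The example below is an illustration under assumptions that are not in the paper: demands are taken to be i.i.d. exponential with a common mean, so that $\Pr\{D_i > y\} = e^{-y/\text{mean}}$ and the total demand has an Erlang tail in closed form, and $x_0$ is taken small enough that $Y > X$. All function and parameter names (`erlang_tail`, `bisect`, `stock_control`, `CS_exp`, `CT_exp`) are hypothetical.

```python
import math

def erlang_tail(t, n, mean):
    """Pr{D_1 + ... + D_n > t} for i.i.d. exponential demands with the given mean."""
    if t <= 0:
        return 1.0
    z = t / mean
    return math.exp(-z) * sum(z**k / math.factorial(k) for k in range(n))

def bisect(func, lo, hi, iters=100):
    """Approximate root of a decreasing function func on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stock_control(x, mean, h, s, CP, CS, CS_exp, CT_exp):
    """Return (eps*, y0, [y1..yn]) built as in (30); x = [x1..xn]."""
    n = len(x)
    ratio = (CP + h) / (h + s - CS_exp)          # right side of (30a), assumed < 1
    Y = bisect(lambda t: erlang_tail(t, n, mean) - ratio, 0.0, 100.0 * n * mean)

    def y_of(eps):                                # (30b) and (30c)
        ys = [max(xi, -mean * math.log(eps)) for xi in x]
        return max(Y - sum(ys), 0.0), ys

    def R(eps):                                   # R(eps) = G(y(eps))
        y0, ys = y_of(eps)
        return (CP + h - (h + s - CS_exp) * erlang_tail(y0 + sum(ys), n, mean)
                + CS - (CS_exp - CT_exp) * erlang_tail(sum(ys), n, mean)) / CT_exp

    eps = bisect(lambda e: R(e) - e, 1e-12, 1.0)  # solve R(eps) = eps
    y0, ys = y_of(eps)
    return eps, y0, ys

# Hypothetical data with C_S < C_T^* < C_S^* < s and s > C_P + C_S^*.
print(stock_control(x=[2.0, 0.0, 15.0], mean=10.0, h=1.0, s=20.0,
                    CP=5.0, CS=1.0, CS_exp=4.0, CT_exp=2.0))
```

Bisection is used for both root searches because the Erlang tail is decreasing in $t$ and $R(\epsilon) - \epsilon$ is decreasing in $\epsilon$. In the output, every facility whose level is raised above $x_i$ ends with the same shortage probability $\epsilon^*$.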
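The first of these conclusions, that all second echelon facilities receiving stock experience the same shortage probability, is the same as a result of Simpson [12, Theorem 2, p. 801] and extends that result to the more general situation where transshipments and expedited movements are possible.

3.2. The Returns Problem

A problem that arises when net demand at some second echelon facilities may be negative is: how many items should these facilities return to the first echelon? We shall call this the returns problem.

The returns problem is the special case of Problem III where $C_T > C_R + C_S$ (hence $T = 0$) and $C_S = C_S^* < C_T^*$ (hence regular shipments will be zero). To avoid unnecessary complications, we assume that $h = h_0$ and that $\sum_{i=0}^{n} y_i$ is given and equal to $\sum_{i=0}^{n} x_i$. Thus we seek a vector $y = (y_1, \ldots, y_n)$ to minimize

(31) $F_2(y) = C_R \sum_{i=1}^{n} (x_i - y_i) + C_S^* E\left[\sum_{i=1}^{n} (D_i - y_i)^+\right] + (C_T^* - C_S^*) E\left[\left(\sum_{i=1}^{n} (D_i - y_i)^+ - x_0 - \sum_{i=1}^{n} (x_i - y_i)\right)^+\right]$

subject to

(32) $0 \le y_i \le x_i$, $i = 1, 2, \ldots, n.$

An algorithm for solving exactly this type of problem is given in [11]. We shall restrict ourselves to characterizing the optimal solution.

Since $F_2$ is a convex function, Lemma 2 asserts that $y^\circ$ is an optimal point if and only if either

(33a) $\frac{\partial F_2(y^\circ)}{\partial y_i} = 0$, $0 \le y_i^\circ \le x_i$

or

(33b) $\frac{\partial^+ F_2(y^\circ)}{\partial y_i} \ge 0$ and $y_i^\circ = 0,$

or

(33c) $\frac{\partial^- F_2(y^\circ)}{\partial y_i} \le 0$ and $y_i^\circ = x_i,$

holds for $i = 1, 2, \ldots, n$.

By interchanging the derivative and expectation operators we get

(34a) $\frac{\partial^+}{\partial y_i} E\left[\left(\sum_{k=1}^{n} (D_k - y_k)^+ - x_0 - \sum_{k=1}^{n} (x_k - y_k)\right)^+\right] = \Pr\left\{D_i \le y_i,\ \sum_{k=1}^{n} (D_k - y_k)^+ > y_0\right\}$

(34b) $\frac{\partial^-}{\partial y_i} E\left[\left(\sum_{k=1}^{n} (D_k - y_k)^+ - x_0 - \sum_{k=1}^{n} (x_k - y_k)\right)^+\right] = \Pr\left\{D_i < y_i,\ \sum_{k=1}^{n} (D_k - y_k)^+ > y_0\right\}$

where $y_0 = x_0 + \sum_{k=1}^{n} (x_k - y_k)$, and $\frac{\partial^+}{\partial y_i} = \frac{\partial^-}{\partial y_i}$ whenever $0 < y_i < x_i$. Recalling that $V_i = (D_i - y_i)^+$ and

$R = \sum_{i=1}^{n} (x_i - y_i)$

and combining (33) and (34) yields: $y^\circ$ is an optimal solution if either

(35a) $C_T^* \Pr\left\{V_i = 0,\ \sum_{k=1}^{n} V_k > x_0 + R\right\} = C_R + C_S^* \left[\Pr\{V_i > 0\} + \Pr\left\{V_i = 0,\ \sum_{k=1}^{n} V_k > x_0 + R\right\}\right]$, $0 \le y_i^\circ \le x_i$

or

(35b) $C_T^* \Pr\left\{D_i \le 0,\ \sum_{k=1}^{n} V_k > x_0 + R\right\} \ge C_R + C_S^* \left[\Pr\{D_i > 0\} + \Pr\left\{D_i \le 0,\ \sum_{k=1}^{n} V_k > x_0 + R\right\}\right]$, $y_i^\circ = 0$

or

(35c) $C_T^* \Pr\left\{D_i < x_i,\ \sum_{k=1}^{n} V_k > x_0 + R\right\} \le C_R + C_S^* \left[\Pr\{D_i \ge x_i\} + \Pr\left\{D_i < x_i,\ \sum_{k=1}^{n} V_k > x_0 + R\right\}\right]$, $y_i^\circ = x_i.$

To help interpret (35) note that the right side of (35a) is the expected additional return plus expedited shipment cost associated with an additional unit returned from facility $i$ to facility 0, and the left side is the expected expedited transshipment cost of that same unit if it were not returned.

3.3. The Transshipment Problem

Consider a multi-service center company where orders placed on a service center can be satisfied by an expedited transshipment from another service center when the original service center is out of stock. For this system we are interested in determining if transshipments done in anticipation of demand are economical.

For this problem there is no first echelon and the only types of regular and expedited movements are transshipments. From the general theory, it can be shown that the optimization problem is to find a vector $y = (y_1, \ldots, y_n)$ which minimizes

(36) $F_3(y) = C_T \sum_{i=1}^{n} (y_i - x_i)^+ + C_T^* E\left[\sum_{i=1}^{n} (D_i - y_i)^+\right]$

subject to

(37) $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} x_i.$

The solution, $y^\circ$, to this problem can be obtained by minimizing the Lagrangian function

(38) $L(y, \lambda) = F_3(y) - \lambda\left[\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i\right]$

subject to $y \ge 0$, $\lambda \ge 0$.

According to Lemma 2 we need only construct $y^\circ$, $\lambda^\circ$ for which either

(39a) $\frac{\partial L(y^\circ, \lambda^\circ)}{\partial y_i} = 0$, $0 < y_i^\circ < x_i$ or $x_i < y_i^\circ < \sum_{j=1}^{n} x_j$

or

(39b) $\frac{\partial^+ L(y^\circ, \lambda^\circ)}{\partial y_i} \ge 0$, $y_i^\circ = 0$

or

(39c) $\frac{\partial^- L(y^\circ, \lambda^\circ)}{\partial y_i} \le 0 \le \frac{\partial^+ L(y^\circ, \lambda^\circ)}{\partial y_i}$, $y_i^\circ = x_i$

or

(39d) $\frac{\partial^- L(y^\circ, \lambda^\circ)}{\partial y_i} \le 0$, $y_i^\circ = \sum_{j=1}^{n} x_j.$

These conditions imply

(40a) $\Pr\{D_i > y_i^\circ\} = \frac{\lambda^\circ}{C_T^*}$, $0 < y_i^\circ < x_i$

or

(40b) $\Pr\{D_i > y_i^\circ\} = \frac{\lambda^\circ + C_T}{C_T^*}$, $x_i < y_i^\circ < \sum_{j=1}^{n} x_j$

or

(40c) $\Pr\{D_i > 0\} \le \frac{\lambda^\circ}{C_T^*}$, $y_i^\circ = 0$

or

(40d) $\frac{\lambda^\circ}{C_T^*} \le \Pr\{D_i > x_i\} \le \frac{\lambda^\circ + C_T}{C_T^*}$, $y_i^\circ = x_i$

or

(40e) $\frac{\lambda^\circ + C_T}{C_T^*} \le \Pr\{D_i \ge y_i^\circ\}$, $y_i^\circ = \sum_{j=1}^{n} x_j.$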
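Conditions (40) suggest a simple numerical procedure: for a trial multiplier $\lambda$, set each $y_i$ from whichever of (40a)-(40e) applies, then adjust $\lambda$ until constraint (37) holds. The sketch below implements this idea by bisection. It is an illustration rather than the authors' algorithm, and it assumes independent exponential demands, so that $\Pr\{D_i > y\} = e^{-y/m_i}$ with no probability mass at zero, and $C_T > 0$. The function and parameter names are hypothetical, with `CT_exp` standing for $C_T^*$.

```python
import math

def transshipment(x, means, CT, CT_exp, iters=200):
    """Return (lambda, y) satisfying (37) and (40) for exponential demands.

    x = starting stocks, means = exponential demand means, CT > 0.
    """
    total = sum(x)

    def y_of(lam):
        p_send = lam / CT_exp            # right side of (40a)
        p_recv = (lam + CT) / CT_exp     # right side of (40b)
        ys = []
        for xi, mi in zip(x, means):
            q = math.exp(-xi / mi)       # Pr{D_i > x_i}
            if q > p_recv:               # receives stock: (40b), capped as in (40e)
                ys.append(min(-mi * math.log(p_recv), total))
            elif q < p_send:             # sends stock: (40a), or y_i = 0 as in (40c)
                ys.append(max(-mi * math.log(p_send), 0.0))
            else:                        # keeps its stock: (40d)
                ys.append(xi)
        return ys

    lo, hi = 0.0, CT_exp                 # sum(y_of(lam)) is nonincreasing in lam
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if sum(y_of(lam)) > total:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return lam, y_of(lam)

# Hypothetical data: one well-stocked location and two nearly empty ones.
print(transshipment([30.0, 2.0, 1.0], [10.0, 10.0, 10.0], CT=1.0, CT_exp=4.0))
```

Bisection works here because each $y_i(\lambda)$ is continuous and nonincreasing in $\lambda$: at $\lambda = 0$ no location sends stock, and at $\lambda = C_T^*$ every location would give up all of its stock.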
The Lagrange multiplier $\lambda^\circ$ can be interpreted as the "cost" of using up one of the items available for transshipment. Rewrite (40b) as

$C_T^* \Pr\{D_i > y_i^\circ\} = \lambda^\circ + C_T$

and observe that for the last item received at location $i$, the left side is the expected savings in expedited transshipments and the right side is the cost of placing it at location $i$. For those locations which transship some of their stock, the same interpretation holds with $C_T = 0$ (because the cost of transshipping to themselves is zero), which yields (40a). So all facilities which receive transshipments have the same shortage probability, which is larger than the common shortage probability for all locations which transship some but not all of their stock.

Another conclusion that follows from these conditions is that if

$\Pr\{D_i > x_i\} \le \frac{C_T}{C_T^*}$

for all $i$, then $\lambda^\circ = 0$, $y^\circ = (x_1, \ldots, x_n)$; i.e., nothing is transshipped.

4. MULTIPERIOD EXTENSION

The multiperiod extensions of our model using either the discounted total expected cost criterion or the expected cost per period criterion are straightforward to formulate and in general difficult to solve. However, there is one situation where a formulation of the multiperiod problem with the expected cost per period criterion is tractable. This is when the solution to the one period problem does not depend on $x = (x_0, x_1, \ldots, x_n)$; because then it is reasonable in the multiperiod problem (it may even be optimal) to restrict attention to policies that do not depend on $x$. The policy is specified by a vector $y \ge 0$, and the cost rate for an infinite horizon can be written in terms of $y$, the vector of inventory levels just before the demands occur, and then minimized to find the optimal value.

To illustrate this we consider the infinite horizon extension of the stock control problem.
Using the policy $y$ let $C(y)$ be the asymptotic cost rate, $L(y)$ be the expected holding plus shortage cost rate for the system, $P(y)$ be the expected cost rate of purchases, $M(y)$ be the expected cost rate for regular movements and $M^*(y)$ be the expected cost rate for expedited movements. We assume that all demand processes are stationary and all cost factors are constant with time. The assumptions made in Section 3.1 are invoked here also. Clearly

(41) $C(y) = L(y) + P(y) + M(y) + M^*(y).$

From the nature of the policy we find

(42a) $L(y) = s E\left[\left(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\right)^+\right] + h E\left[\left(\sum_{i=0}^{n} y_i - \sum_{i=1}^{n} D_i\right)^+\right],$

and

(42b) $P(y) = C_P \sum_{i=1}^{n} E(D_i)$

immediately. Since the only regular movements are shipments and in each period the number shipped to the second echelon will be just enough to replace the stock depletion from demand in the last period,

(42c) $M(y) = C_S E\left[\min\left(\sum_{i=1}^{n} D_i,\ \sum_{i=1}^{n} y_i\right)\right] = C_S E\left[\sum_{i=1}^{n} D_i - \left(\sum_{i=1}^{n} (D_i - y_i)\right)^+\right].$

Since $C_T^* < C_S^*$ we obtain from (9), (14) and (23)

(42d) $M^*(y) = C_T^* E\left(\sum_{i=1}^{n} V_i - V_0\right) + (C_S^* - C_T^*) E(S^*) = C_T^* E\left[\sum_{i=1}^{n} (D_i - y_i)^+ - \left(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\right)^+\right] + (C_S^* - C_T^*) \left\{E\left[\left(\sum_{i=1}^{n} (D_i - y_i)\right)^+\right] - E\left[\left(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\right)^+\right]\right\}.$

Substituting (42a)-(42d) into (41) and collecting terms yields

(43) $C(y) = (C_P + C_S - h) \sum_{i=1}^{n} E(D_i) + h \sum_{i=0}^{n} y_i + (h + s - C_S^*) E\left[\left(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\right)^+\right] + C_T^* E\left[\sum_{i=1}^{n} (D_i - y_i)^+\right] + (C_S^* - C_T^* - C_S) E\left[\left(\sum_{i=1}^{n} (D_i - y_i)\right)^+\right].$

Again according to Lemma 2, to find $y^\circ$, we need only construct a vector for which either

(44) $\frac{\partial C(y^\circ)}{\partial y_i} = 0$ or $y_i^\circ = 0$ and $\frac{\partial C(y^\circ)}{\partial y_i} \ge 0$

for $i = 0, 1, \ldots, n$. The vector $y^\circ$ may be constructed by using the methods in Section 3.1; we omit the details. The solution is different from the single period solution, but has the same characteristics. One of these characteristics is that the probability of shortage at a second echelon facility can be quite large. This is because once an item has been shipped to the second echelon, the only way to move it to a different second echelon facility is by an expensive expedited transshipment.
When the item is retained at the first echelon and is not needed during the period in question, it may be subsequently shipped by the cheaper mode to a more needy second echelon facility.

Unfortunately, the above approach does not work in general. For example, the one period solutions of the returns problem and the transshipment problem discussed in Sections 3.2 and 3.3 both depend on $x$, and therefore the analyses cannot readily be extended to the multiperiod case.

5. ACKNOWLEDGMENT

The authors would like to acknowledge the contributions made to this work by Alan Rolfe, who formulated a version of the returns problem which helped lead to the general problem presented here.

REFERENCES

[1] Allen, S. G., "Redistribution of Total Stock over Several User Locations," Naval Research Logistics Quarterly, 5, 51-59 (1958).
[2] Bessler, S., and A. F. Veinott Jr., "Optimal Policy for a Dynamic Multi-Echelon Inventory Problem," Naval Research Logistics Quarterly, 13, 355-389 (1966).
[3] Clark, A. J., "An Informal Survey of Multi-Echelon Inventory Theory," Naval Research Logistics Quarterly, 19, 621-650 (1972).
[4] Clark, A. J., and H. Scarf, "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science, 6, 475-490 (1960).
[5] Das, Chandrasekhar, "Supply and Redistribution Rules for Two-Location Inventory Systems: One-Period Analysis," Management Science, 21, 765-776 (1975).
[6] Eaves, B. C., and R. Saigal, "Homotopies for Computing Fixed Points in Unbounded Regions," Mathematical Programming, 3, 225-237 (1972).
[7] Fiacco, A. V., and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques (Wiley, New York, 1968).
[8] Gross, D., "Centralized Inventory Control in Multilocation Supply Systems," Chapter 3 in Multistage Inventory Models and Techniques, H. E. Scarf et al. eds. (Stanford University Press, Stanford, California, 1963).
[9] Fukuda, Y., "Optimal Disposal Policies," Naval Research Logistics Quarterly, 8, 221-227 (1961).
[10] Krishnan, K. S., and V. R. K. Rao, "Inventory Control in N Warehouses," The Journal of Industrial Engineering, 16, 212-215 (1965).
[11] Mifflin, Robert, "A Nonderivative Algorithm for Minimization of a Function of Several Bounded Variables," paper presented at the joint O.R.S.A./T.I.M.S. Meeting (October, 1974).
[12] Simpson, K. E., Jr., "A Theory of Allocation of Stocks to Warehouses," Operations Research, 7, 797-805 (1959).
[13] Zangwill, W. I., Nonlinear Programming (Prentice-Hall, Englewood Cliffs, 1969).

OPTIMAL REJECT ALLOWANCE WITH CONSTANT MARGINAL PRODUCTION EFFICIENCY‡

Avraham Beja
The Leon Recanati Graduate School of Business Administration
Tel-Aviv University
Tel-Aviv, Israel

ABSTRACT

A job shop must fulfill an order for N good items. Production is conducted in "lots," and the number of good items in a lot can be accurately determined only after production of that lot is completed. If the number of good items falls short of the outstanding order, the shop must produce further lots, as necessary. Processes with "constant marginal production efficiency" are investigated. The revealed structure allows efficient exact computation of optimal policy. The resulting minimal cost exhibits a consistent (but not universal) pattern whereby higher quality of production is advantageous even at proportionately higher marginal cost.

I. INTRODUCTION AND SUMMARY

Consider a job shop with an outstanding order for N items. Production is conducted in "runs" or "lots"* involving a set up cost and direct production cost. Some of the items in any lot may be defective, and the number of good items can be accurately determined only after the production run is completed. If the number of good items produced falls short of the outstanding order, the shop must initiate further runs as necessary.
The operational problem, then, is to determine the optimal lot size that minimizes total costs incurred to fulfill the order. We tend to expect that under "reasonable" conditions the optimal lot size will be larger than the outstanding order to allow for the anticipated rejects, hence the traditional terminology of "reject allowance" or "shrinkage allowance" associated with this problem.†

‡This study was supported in part by the Israel Institute for Business Research, University Campus, Ramat Aviv, Tel Aviv, Israel.
*The terms run and lot will be used interchangeably.
†Some conditions for this to be indeed true will become clear through subsequent results.

Our analysis of the reject allowance problem includes questions of structure and of computation. The study of structure concerns the qualitative properties (monotonicity, unimodality) of the relevant functions such as the minimal cost and the optimal policy. Computationally, the focus is on efficient methods for calculating the optimal decision in any given problem. The two aspects are naturally interrelated; understanding the structure allows more efficient computation, and computational algorithms provide insight on structure.

The computational aspect of the reject allowance problem has attracted considerable interest (e.g. [2, 3, 4, 5, 7, 8, 9]). The analysis usually tended to concentrate on the search for approximations to the cost function that would facilitate the computation. The suggested ("optimal") production decision would then be the point that satisfies the usual first order conditions for the surrogate (or approximate) cost function. Before a solution based on an analysis of this kind can be safely employed, however, some quantitative measures of the degree of approximation must be established. Otherwise, there is very little assurance that the suggested solution is in any meaningful sense "close" to the optimal lot size or the minimal cost.
Unfortunately, no such measures are given in the literature. For example, Hillier [5] suggests an approximation for the first differences of the cost function in a special production process. Even if the approximation is asymptotically valid, no general assessment of the error is available for finite order sizes. Some writers (e.g. [2, 7, 8]) assume that all costs beyond that of the first production run and one eventual additional set up cost may be ignored. This is naturally restricted to cases with relatively high set-up cost, but even then it is not at all clear that production runs beyond the first one should be either few in number or small in size. Generally, the basic difficulty with all these approximations is that the analysis of the structure of optimal policy is based not only on assumptions regarding the problem's parameters, but also on assumptions regarding the optimal policy.

The present study is devoted to the analysis of a class of production processes characterized by a property called "constant marginal efficiency." This includes as a special case the widely considered process with constant marginal production cost and binomially distributed defectives (e.g. [2, 3, 5, 7, 8, 9]). The computational aspects of the problem are addressed first. Here, we seek efficient methods for an exact solution to the problem. The model is formally presented in Section II, and it is shown that, in principle, an exact solution can be computed recursively by enumeration of all "relevant" lot sizes. It is then suggested that if a certain structure of the model is established, the range of computation will be drastically reduced. To achieve this, a Markovian decision model formulation of the problem is presented and analyzed in Section III. The direct computational implications are drawn in Section IV.
Roughly speaking, we establish that (i) there is an optimal policy whereby lot sizes are strictly increasing with the outstanding order, and (ii) if for some outstanding order a lot of $n+1$ is inferior to a lot of $n$, larger lots need not be considered. Given these results, the necessary volume of computation is not only "reasonable," but indeed trivial with modern computing capacity even for fairly large order sizes.

In Section V, we investigate the sensitivity of total cost to the quality of production. The results show that under a fairly wide range of circumstances (though not always) higher quality is advantageous even at a proportionately higher marginal production cost. Some remarks on possible extensions and applications conclude the study in Section VI.

II. A STRAIGHTFORWARD FORMULATION OF THE MODEL

Production is defined by a stochastic process $[X_j,\ j = 1, 2, \ldots]$ where $X_j = 1$ if the $j$th item in the lot is good and $X_j = 0$ if it is defective. $G_k = \sum_{j=1}^{k} X_j$ denotes the number of good items in a lot of size $k$. For $j = 1, 2, \ldots$ let $q_j = P[X_j = 1]$ be a given technological property of the process, assumed independent of $X_1, X_2, \ldots, X_{j-1}$. This property may be equivalently represented by

$$q(j,k) = P[G_k = j]$$

or by

$$Q(j,k) = P[G_k \ge j] = \sum_{i=j}^{k} q(i,k), \qquad k = 1, 2, \ldots, \quad j = 0, 1, \ldots, k,$$

where by elementary probability theory we have for $k \ge 1$ (with $q(0,0) = 1$)

$$q(j,k) = \begin{cases} q_k\, q(j-1,k-1) + (1-q_k)\, q(j,k-1) & \text{for } j = 0, \ldots, k \\ 0 & \text{otherwise.} \end{cases} \tag{2.1}$$

The production cost for a lot of size $k$ is $Y(k) = C_0 + C_1 + \cdots + C_k$, where $C_0 \ge 0$ is a set-up cost and $C_j > 0$ is the expected direct (marginal) cost for producing the $j$th item in a run. $h_j = C_j / q_j$ is therefore the expected cost per good item produced at the $j$th "stage" in a production run, and $e_j = 1/h_j$ can therefore be considered the production "efficiency" at the $j$th stage. It is naturally assumed throughout that $Y(k)$ is unbounded.
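The recursion (2.1) and the lot cost $Y(k)$ can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original paper: the function names `good_item_pmf` and `lot_cost` are hypothetical, the list `q` holds $q_1, q_2, \ldots$ in positions 0, 1, ..., and `lot_cost` assumes the constant marginal efficiency case $C_j = h q_j$ that the later sections study.

```python
def good_item_pmf(q, k):
    """Return [q(0,k), ..., q(k,k)], the distribution of G_k, via recursion (2.1).

    q is a list with q[j-1] = q_j (hypothetical layout chosen for this sketch).
    """
    pmf = [1.0]  # q(0,0) = 1: an empty lot contains no good items
    for m in range(1, k + 1):
        qm = q[m - 1]
        new = [0.0] * (m + 1)
        for j in range(m + 1):
            stay = pmf[j] if j < m else 0.0    # item m defective: q(j, m-1)
            up = pmf[j - 1] if j > 0 else 0.0  # item m good: q(j-1, m-1)
            new[j] = qm * up + (1.0 - qm) * stay
        pmf = new
    return pmf


def lot_cost(c0, h, q, k):
    """Y(k) = C_0 + C_1 + ... + C_k, assuming C_j = h * q_j for every j."""
    return c0 + h * sum(q[:k])


if __name__ == "__main__":
    print(good_item_pmf([0.5, 0.5], 2))
    print(lot_cost(1.0, 2.0, [0.5, 0.5], 2))
```

With identical $q_j$ the result is the familiar binomial distribution; with unequal $q_j$ the same loop gives the general case covered by (2.1).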
We define three inter-related cost functions:

$F(N)$ is the (minimal) total expected cost to fulfill an outstanding order of $N$ items when an optimal policy is carried throughout.
$F(N,n)$ is the expected cost to fulfill an outstanding order of $N$ items, if the first lot is of size $n$ and all subsequent lots are optimal.
$f(N,n)$ is the expected cost to fulfill an outstanding order of $N$ items if a lot of size $n$ is produced whenever the outstanding order is of size $N$ and an optimal lot size is produced whenever the outstanding order is less than $N$.

The three functions satisfy the following relationships:

$$f(N,n) = Y(n) + q(0,n)\, f(N,n) + \sum_{j=1}^{N-1} q(j,n)\, F(N-j) \tag{2.2}$$
$$f(N,n) = \Big\{Y(n) + \sum_{j=1}^{N-1} q(j,n)\, F(N-j)\Big\} \Big/ \{1 - q(0,n)\} \tag{2.3}$$
$$F(N,n) = Y(n) + \sum_{j=0}^{N-1} q(j,n)\, F(N-j) \tag{2.4}$$
$$F(N) = \min_{n=1,2,\ldots} F(N,n) = \min_{n=1,2,\ldots} f(N,n) \tag{2.5}$$

$F(N)$ clearly exists, and can in principle be computed in a finite number of steps as follows. Start with $N = 1$. Compute $f(1,n) = Y(n)/\{1 - q(0,n)\}$ as in (2.3) for $n = 1, 2, \ldots$ until, say, $m$ where

$$Y(m) \ge \min_{n=1,\ldots,m-1} f(1,n) = \hat{F}(N).$$

Clearly $f(1,n) > \hat{F}(N)$ for $n \ge m$, and hence $F(N) = \hat{F}(N)$. Proceed similarly with $N = 2, 3, \ldots$, at each stage using in (2.3) the values $F(M)$, $M = 1, \ldots, N-1$, computed previously.

The practical difficulty with this procedure originates from the wide range of values of $n$ for which $f(N,n)$ must be evaluated at each $N$. The present study helps to reduce this range for processes with constant marginal production efficiency, i.e. $h_j = h$ for all $j$. This is achieved by establishing two important properties of these processes:

(i) If a lot of $n$ is optimal for an outstanding order of $N$, then for an order of $N+1$ there is an optimal lot $r > n$ (monotonicity of the optimal policy).
(ii) If $n$ is optimal for $N$ and $f(N,k+1) > f(N,k)$ then $n \le k$.

These properties are proved indirectly by an alternative Markovian decision model formulation of the same process.*

III.
AN ALTERNATIVE MARKOVIAN FORMULATION

Define a Markov decision model as follows: for $k = 0, 1, 2, \ldots$ and $N = 0, 1, 2, \ldots$, the process is said to be in state $(k,N)$ if the outstanding order is $N$, set up has been achieved and $k$ items have been manufactured but not inspected for defectives. The process operates in discrete time, and at each period the following actions are available:

(i) Continue production at an immediate cost of $C_{k+1}$, with a transition to state $(k+1,N)$ with probability 1.
(ii) Stop production and inspect the $k$ items already produced for the number $j$ of good items; if $j < N$ a new production run must be initiated, hence with probability $q(j,k)$, $j = 0, \ldots, N-1$, there is a transition to $(0,N-j)$ at a cost $C_0$, and with probability $Q(N,k)$ the process terminates.†

*Both properties are very intuitive. Should the reader feel that this calls for a direct proof, he may find it instructive to try proving any of them directly for the simplest case where $C_j = C$ and $q_j = q$ for all $j$.
†To ensure termination under optimal policy even if $C_0 = 0$, inspection is not allowed if $N > 0$ and $k = 0$.

An optimal policy is a decision rule that minimizes the expected sum of all costs until the process terminates. Let $v(k,N)$ be the minimal expected total cost from the time the process is in state $(k,N)$ until termination. $v(k,N)$ clearly exists, and satisfies

$$v(k,N) = \min\,[\,v^I(k,N),\ C_{k+1} + v(k+1,N)\,] \tag{3.1}$$

where

$$v^I(k,N) = \sum_{j=0}^{N-1} q(j,k)\,\{C_0 + v(0,N-j)\}. \tag{3.2}$$

$v^I(k,N)$ is the conditional minimal expected cost from state $(k,N)$ until termination, given that the process is to be inspected in the present period.

LEMMA 3.1: $v^I(k+1,N+1) = (1-q_{k+1})\,v^I(k,N+1) + q_{k+1}\,v^I(k,N)$.

PROOF: By conditioning the expected cost on $X_{k+1}$.

Let $H(k,N) = v^I(k,N) - v^I(k+1,N)$; then for $N = 1, 2, \ldots$

$$H(k,N) = v^I(k,N) - (1-q_{k+1})\,v^I(k,N) - q_{k+1}\,v^I(k,N-1)$$
$$H(k,N) = q_{k+1}\{v^I(k,N) - v^I(k,N-1)\} \tag{3.3}$$
$$H(k+1,N) = (1-q_{k+1})\,v^I(k,N) + q_{k+1}\,v^I(k,N-1) - (1-q_{k+2})\,v^I(k+1,N) - q_{k+2}\,v^I(k+1,N-1)$$
$$H(k+1,N) = (1-q_{k+2})\,H(k,N) + q_{k+2}\,H(k,N-1) - (q_{k+1}-q_{k+2})\{v^I(k,N) - v^I(k,N-1)\}$$

and substituting (3.3)

$$H(k+1,N) = \Big\{1 - q_{k+2} - \frac{q_{k+1}-q_{k+2}}{q_{k+1}}\Big\} H(k,N) + q_{k+2}\,H(k,N-1)$$
$$H(k+1,N) = \frac{q_{k+2}}{q_{k+1}}\,\{(1-q_{k+1})\,H(k,N) + q_{k+1}\,H(k,N-1)\}. \tag{3.4}$$

Let $M$ denote the set of all states in which it is optimal to manufacture another item and $I$ the set of all states in which it is optimal to stop and inspect, i.e.

$$(k,N) \in M \ \text{if} \ v(k,N) = C_{k+1} + v(k+1,N) \tag{3.5}$$
$$(k,N) \in I \ \text{if} \ v(k,N) = v^I(k,N); \tag{3.6}$$

then for $(k,N) \in I$

$$v^I(k,N) \le C_{k+1} + v(k+1,N) \le C_{k+1} + v^I(k+1,N)$$
$$H(k,N) \le C_{k+1} \ \text{for all} \ (k,N) \in I, \tag{3.7}$$

and if

$$v^I(k,N) \ge C_{k+1} + v^I(k+1,N)$$

then

$$v^I(k,N) \ge C_{k+1} + v(k+1,N)$$

and $(k,N) \in M$; thus

$$H(k,N) \ge C_{k+1} \ \text{implies} \ (k,N) \in M. \tag{3.8}$$

We shall also use the following important property:

LEMMA 3.2: $v(0,N+1) - v(0,N) \ge h$.

PROOF: Let $r = \min\{k : (k,N+1) \in I\}$ (clearly $r$ exists and $r > 0$). Then

$$v(0,N+1) = C_1 + \cdots + C_r + v^I(r,N+1)$$
$$v(0,N+1) = C_1 + \cdots + C_r + (1-q_r)\,v^I(r-1,N+1) + q_r\,v^I(r-1,N) \tag{3.9}$$
$$v(0,N+1) \ge C_1 + \cdots + C_{r-1} + C_r + (1-q_r)\,v(r-1,N+1) + q_r\,v(r-1,N),$$

but by recursive application of (3.1)

$$v(0,N) \le C_1 + \cdots + C_{r-1} + v(r-1,N)$$

and by assumption

$$v(0,N+1) = C_1 + \cdots + C_{r-1} + v(r-1,N+1).$$

Hence from (3.9)

$$v(0,N+1) \ge C_r + (1-q_r)\,v(0,N+1) + q_r\,v(0,N)$$
$$v(0,N+1) - v(0,N) \ge C_r / q_r = h.$$

COROLLARY 3.3: For $N = 1, 2, \ldots$, $H(0,N) \ge C_1$.

PROOF: By Lemma 3.2 and (3.3).

We now present the basic structure of the reject allowance model.

THEOREM 3.4: There is a strictly increasing sequence $\langle n^*(N) \rangle$ of non-negative integers, such that in state $(k,N)$ it is optimal to continue production if $k < n^*(N)$ and to stop and inspect if $k \ge n^*(N)$.

PROOF: We show by induction on $N$ that
(i) for $k < n^*(N)$, $(k,N) \in M$ and $H(k,N) \ge C_{k+1}$;
(ii) for $k \ge n^*(N)$, $(k,N) \in I$ and $H(k,N) < C_{k+1}$;
(iii) $n^*(N+1) > n^*(N)$.

For $N = 0$, (i) and (ii) are trivially true with $n^*(0) = 0$, because $v^I(k,0) = 0$ and $H(k,0) = 0$ for all $k$.
For $N = 1$, let $m = \min\{k : H(k,1) < C_{k+1}\}$. $m$ exists because if $H(k,1) \ge C_{k+1}$ for all $k$ then $(k,1) \in M$ for all $k$ and $v(0,1) = C_1 + \cdots + C_n + v(n,1)$ for all $n$, which is impossible because $v(0,1)$ clearly exists and $Y(n)$ is unbounded. By Corollary 3.3, $m > 0$. For $k \ge m$, $H(k,1) < C_{k+1}$ by induction, because if $H(j,1) < C_{j+1}$ then by (3.4)

$$H(j+1,1) < \frac{q_{j+2}}{q_{j+1}}\,(1-q_{j+1})\,C_{j+1} \le \frac{q_{j+2}}{q_{j+1}}\,C_{j+1} = C_{j+2}.$$

Let $n^*(1) = m$.

Assume inductively that (i) and (ii) are true for $N$. By Corollary 3.3, $H(0,N+1) \ge C_1$, and we note that if $H(k-1,N) \ge C_k$ and $H(k-1,N+1) \ge C_k$ then by (3.4)

$$H(k,N+1) \ge \frac{q_{k+1}}{q_k}\,\{(1-q_k)\,C_k + q_k\,C_k\} = \frac{q_{k+1}}{q_k}\,C_k = C_{k+1};$$

hence, by induction on $k$, $H(k,N+1) \ge C_{k+1}$ for $k = 1, \ldots, n^*(N)$. Let

$$r = \min\{k : H(k,N+1) < C_{k+1}\}.$$

Existence of $v(0,N+1)$ implies existence of $r$, and clearly $r > n^*(N)$. For $k > r$, $H(k-1,N) < C_k$ and hence, again by induction on $k$, $H(k,N+1) < C_{k+1}$, because if $H(k-1,N+1) < C_k$ then

$$H(k,N+1) < \frac{q_{k+1}}{q_k}\,\{(1-q_k)\,C_k + q_k\,C_k\} = \frac{q_{k+1}}{q_k}\,C_k = C_{k+1}.$$

Let $n^*(N+1) = r$, and the theorem is proved by induction for all $N$.

IV. STRUCTURE AND COMPUTATION

The formulation of Section III is convenient for understanding the structure of the reject allowance model, but not necessarily for computing the optimal policy. It therefore remains to be seen how the results of Theorem 3.4 bear on the direct computation of optimal policy by $f(N,n)$. For $N = 1, 2, \ldots$, $F(N) = C_0 + v(0,N)$, and by (2.4) and (3.2)

$$v^I(k,N) = F(N,k) - Y(k) \tag{4.1}$$

and if $n^*(N) = m$

$$F(N) = C_0 + v(0,N) = C_0 + C_1 + \cdots + C_m + v^I(m,N) = F(N,m).$$

The (strict) monotonicity of $n^*(N)$ thus allows the evaluations of $f(N+1,n)$ to start with $n = n^*(N) + 1$. Now consider

$$H(k,N) = \{F(N,k) - Y(k)\} - \{F(N,k+1) - Y(k+1)\} = F(N,k) - F(N,k+1) + C_{k+1},$$

so that $H(k,N) \ge C_{k+1}$ is equivalent to $F(N,k) \ge F(N,k+1)$ and $H(k,N) < C_{k+1}$ is equivalent to $F(N,k) < F(N,k+1)$. Theorem 3.4 therefore establishes that $F(N,n)$ is "quasi-convex," since $F(N,n+1) \le F(N,n)$ for $n < n^*(N)$ and $F(N,n+1) > F(N,n)$ for $n \ge n^*(N)$. Computationally, however, the interest is in $f(N,n)$ rather than $F(N,n)$, and one further step is necessary.
THEOREM 4.1: $f(N,n+1) > f(N,n)$ implies $F(N,n+1) > F(N,n)$.

PROOF: By (2.2) and (2.4)

$$f(N,n+1) - f(N,n) = F(N,n+1) + q(0,n+1)\,f(N,n+1) - q(0,n+1)\,F(N) - F(N,n) - q(0,n)\,f(N,n) + q(0,n)\,F(N)$$
$$f(N,n+1) - f(N,n) = F(N,n+1) - F(N,n) + q(0,n+1)\,\{f(N,n+1) - f(N,n)\} - \{q(0,n) - q(0,n+1)\}\,f(N,n) + \{q(0,n) - q(0,n+1)\}\,F(N)$$
$$\{q(0,n) - q(0,n+1)\}\,\{f(N,n) - F(N)\} + \{1 - q(0,n+1)\}\,\{f(N,n+1) - f(N,n)\} = F(N,n+1) - F(N,n). \tag{4.3}$$

The first term on the left hand side of (4.3) is clearly non-negative, and if the second term is positive the right hand side must be positive. Q.E.D.

Note also that at $n = n^*(N)$ the right hand side of (4.3) is positive, the first term on the left hand side vanishes, and hence $f(N,n+1) > f(N,n)$. Theorems 3.4 and 4.1 thus establish a very effective upper bound for the computations of $f(N,n)$, which always stop at $n^*(N) + 1$, where $f(N,n) - f(N,n-1)$ first becomes positive. If the values of $q_j$ are not extremely small there may be no need for more than just a few evaluations of $f(N,n)$ for each $N$. Experimentation suggests that for $N$ as high as 100 the computations need usually take no more than a few seconds with a large scale computer.

V. THE EFFECTS OF QUALITY ON COST

Returning to the general structure of the reject allowance problem, the next question to be considered concerns the relationship between $F(N)$ and the system's parameters. Processes with constant marginal efficiency can be identified by three elements: the set up cost $C_0$, the marginal efficiency $e$ or, equivalently, its reciprocal value $h$, and the sequence $\langle q \rangle = q_1, q_2, \ldots$ which represents the "quality" of production, with a lower percentage of defectives for higher $\langle q \rangle$. Given $\langle q \rangle$ and $C_0$, $F(N)$ clearly increases with $h$, and, similarly, given $\langle q \rangle$ and $h$, $F(N)$ increases with $C_0$. The question, then, is: given $C_0$ and $h$, how does $F(N)$ depend on $\langle q \rangle$? If $C_0 = 0$, the answer is immediate.

THEOREM 5.1: If $C_0 = 0$ then $F(N) = Nh$ (regardless of $\langle q \rangle$).

PROOF: $F(N) \le F(M) + F(N-M)$ for $M < N$, hence $F(N) \le N F(1)$.
But $F(1) \le f(1,1)$, and when $C_0 = 0$ then by (2.3) $f(1,1) = C_1/q_1 = h$, hence $F(N) \le Nh$. By recursive application of Lemma 3.2, $F(N) \ge Nh$, and hence $F(N) = Nh$. Q.E.D.

It is easy to see (e.g. by a numerical example such as the one given at the end of this section) that, generally, $F(N)$ need not be independent of $\langle q \rangle$. In this section we investigate the effects of quality on total cost by making various comparisons at different levels of generalization between processes with the same (arbitrary) $C_0$ and $h$. The comparisons exhibit a consistent pattern whereby lower quality is more costly. As we shall see, however, this statement must be accepted only in the exact sense of the theorems proved below; a careless "universal" interpretation is false.

One way to compare the costs associated with two production processes is by formulating a decision problem where, before each lot is produced, a choice is available as to which of the two processes to use for that lot. If an optimal policy always uses the same process, then certainly this process is (costwise) at least as good as the other. This approach is not only methodologically convenient, but also firmly rooted in the motivation underlying the analysis. We shall make repeated use of Howard's policy iteration principle.* For our purposes, this well known principle may be conveniently summarized in rough words as follows: policy I is preferred (strictly preferred) to policy II if, and only if, using policy I once and policy II afterwards is preferred (strictly preferred) to policy II.

THEOREM 5.2: Let production process I be specified by $C_0$, $h$ and $\langle q \rangle$, and process II by $C_0$, $h$ and $\langle p \rangle$, where $p_i = q_i$ for $i \ne n$ and $p_n \ge \sup\{q_i,\ i = n, n+1, \ldots\}$. Then $F_{II}(N) \le F_I(N)$.

PROOF: It is convenient to introduce a hypothetical third process (process 0) specified by $C_0$, $h$ and $\langle q^0 \rangle$ where $q_n^0 = 0$ and $q_i^0 = q_i$ for $i \ne n$.† Consider a decision problem involving a choice among the three processes. $F_{II}(0) = F_I(0) = F_0(0)$. Assume inductively that for $M = 0, 1, \ldots,$
$N-1$ process II is optimal and $F_{II}(M) \le F_I(M)$, $F_{II}(M) \le F_0(M)$. For $N$, let a non-stationary policy A use process I for the first lot and process II for all subsequent lots, at a total cost of $F^A(N,k)$ if the size of the first lot is $k$. Similarly, policy B uses process 0 once and process II afterwards, at a total cost of $F^B(N,k)$. Let $r$ be optimal for A and $m$ for B, i.e.

$$F^A(N,r) = \min_{k=1,2,\ldots} F^A(N,k) \tag{5.1}$$
$$F^B(N,m) = \min_{k=1,2,\ldots} F^B(N,k). \tag{5.2}$$

We shall show that $F_{II}(N) \le F^A(N,r)$ and $F_{II}(N) \le F^B(N,m)$, so that by Howard's principle process II is optimal for $N$.

$$F^B(N,m) = Y_0(m) + \sum_{i=0}^{N-1} q_0(i,m)\,F_{II}(N-i)$$

so that if $m < n$, $F^B(N,m) = F_{II}(N,m) \ge F_{II}(N)$, and if $m = n$, $F^B(N,m) = F_{II}(N,n-1) \ge F_{II}(N)$.

*Cf. Howard [6] or Blackwell [1].
†This is clearly just a convenient way of assigning the indices in a process which is essentially defined by $C_0$, $h$, $\langle w \rangle$, where $w_i = q_i^0$ for $i \le n-1$ and $w_i = q_{i+1}^0$ for $i \ge n$, with $w_i > 0$ for all $i$.

If $m \ge n+1$, then by (2.1)

$$F^B(N,m) = Y_0(m-1) + q_m h + \sum_{i=0}^{N-1} \{(1-q_m)\,q_0(i,m-1) + q_m\,q_0(i-1,m-1)\}\,F_{II}(N-i)$$
$$F^B(N,m) = F^B(N,m-1) + q_m\,D(m-1) \tag{5.3}$$

where

$$D(m-1) = h - \sum_{i=0}^{N-1} \{q_0(i,m-1) - q_0(i-1,m-1)\}\,F_{II}(N-i). \tag{5.4}$$

Equation (5.2) implies that $D(m-1) \le 0$. Consider $F_{II}(N,m-1)$ as defined in (2.4):

$$F_{II}(N,m-1) = Y_{II}(m-1) + \sum_{i=0}^{N-1} q_{II}(i,m-1)\,F_{II}(N-i).$$

An immediate extension of (2.1), again by elementary probability theory, establishes that for $k \ge n$

$$q_{II}(i,k) = (1-p_n)\,q_0(i,k) + p_n\,q_0(i-1,k)$$

and thus

$$F_{II}(N,m-1) = Y_0(m-1) + p_n h + \sum_{i=0}^{N-1} \{(1-p_n)\,q_0(i,m-1) + p_n\,q_0(i-1,m-1)\}\,F_{II}(N-i)$$
$$F_{II}(N,m-1) = F^B(N,m-1) + p_n\,D(m-1)$$

and by (5.3)

$$F_{II}(N,m-1) = F^B(N,m) + (p_n - q_m)\,D(m-1).$$

$D(m-1) \le 0$ and $p_n \ge q_m$ for $m \ge n+1$ imply that

$$F^B(N,m) \ge F_{II}(N,m-1) \ge F_{II}(N). \tag{5.5}$$

Now consider

$$F^A(N,r) = Y_I(r) + \sum_{i=0}^{N-1} q_I(i,r)\,F_{II}(N-i),$$

and if $r \le n-1$

$$F^A(N,r) = F_{II}(N,r) \ge F_{II}(N).$$

For $r \ge n$,

$$F^A(N,r) = Y_0(r) + q_n h + \sum_{i=0}^{N-1} \{(1-q_n)\,q_0(i,r) + q_n\,q_0(i-1,r)\}\,F_{II}(N-i)$$
$$F^A(N,r) = F^B(N,r) + q_n\,D(r) \tag{5.6}$$
Also Fn{N, r) = YXr)+Vnh+j:{{l-Pn)qo{h r)+p,,g,{i-\, r)]F,iN-i) i=0 Fu{N,r)=F^iN,r)+p„D{r) and by (5.6) (5.7) FrjiN, r)=F^iN, r) + {p„-qn)D{r) If Z)(r)<0 then (5.7) and p„^5„ imply F^'iN, r)^Fii{N, r)^Fn{N), and if D{r)^Q then (5.6), (5.2) and (5.5) imply F-\N, r)^F^{N, r)>F^{N, m)^Fu{N). We have thus proved that Fn{N)^F^{N,r) as well as Fji{N):SiF^{N, m), process // is optimal for A^, and in particular FniN)^Fr{N), so that the proof is valid by induction for all N. It has already been established earlier by Lemma 3.2 that an option to obtain a good item at cost h separated Jrom the production process (i.e. between two production runs) always pays, since F{N)^F{N—l)-{-h. Theorem 5.2 allows an extension of this property, which may perhaps be best formulated immediately in the operational context. COROLLARY 5.3: An option to obtain a good item at cost h at any stage of the production process always pays. PROOF: Let the basic quality sequence of the process be (w). If the option involves perfect production of, say, the n'^^ item, let {q)={w} and let (p) be defined by ^„=1, Pi=Wi for ir^n, and by Theorem 5.2 the option pays. If the option allows the good item to be obtained between the n — l and the n^^ item in a run, let {p) and {q) be defined by pi=^qi=w, for i:<n — 1, ^«=1, 2n=0 and pi — qi=Wi-i for i=;n+l. Theorem 5.2 again holds. Q.E.D. The next theorem generalizes the "advantage of quality" presented in Theorem 5.2. THEOREM 5.4. Let process / be specified by Co, h, and (q), and process // by €„, h, and {p). If 2<>2t+i and2).^2ifori=l, 2, . . . t\ven Fj,{N)^Fj{N). PROOF: Define a sequence of processes, indexed by n—2, 3, . . ., with €„, h and (w") where w»j=^ifori=l . . . 71-1 and w"i=g, for i=n,n+l, .... Then by Theorem 5.2 7^2(iVX-fV(A^) and Fn+i(N):<:Fn{N) for n=2, 3, . . . . Since the range of lot sizes that are relevant for N is bounded, (p) is operationally equivalent to (it;") for n sufficiently large and Fri(N):<Fj{N) Q.E.D. 
Processes with non-increasing $\langle q \rangle$ as in Theorem 5.4 are analogous to the notion of "increasing failure rate" in reliability theory in the sense that negative effects of time and wear dominate production. When effects of "running in" and "learning" are dominant, non-decreasing quality sequences $\langle q \rangle$ are of interest. Here the "advantage of quality" takes a slightly different form.

THEOREM 5.5: Let process I be specified by $C_0$, $h$, and $\langle q \rangle$, and process II by $C_0$, $h$, and $\langle p \rangle$. If $q_i \le q_{i+1}$ and $p_i = q_{i+n}$ for all $i$ and some (natural) $n$ then $F_{II}(N) \le F_I(N)$.

PROOF: It suffices to consider $n = 1$, because for $n = 2, 3, \ldots$ the theorem then follows immediately by induction. Consider the decision problem involving a choice between the two processes. $F_{II}(0) = F_I(0)$, and we assume inductively that for $M = 0, \ldots, N-1$ process II is optimal and $F_{II}(M) \le F_I(M)$. Suppose that process I is optimal for $N$, and that the corresponding optimal lot size is $r$. Let policy A use process II for the first lot and an optimal policy for all subsequent lots, with a total cost of $F^A(N,k)$ if the size of the first lot is $k$. By assumption

$$F(N) = Y_I(r) + \sum_{i=0}^{N-1} q_I(i,r)\,F(N-i)$$
$$F(N) = Y_{II}(r-1) + q_1 h + \sum_{i=0}^{N-1} \{(1-q_1)\,q_{II}(i,r-1) + q_1\,q_{II}(i-1,r-1)\}\,F(N-i)$$
$$F(N) = F^A(N,r-1) + q_1\,D_{II}(r-1) \tag{5.8}$$

where

$$D_{II}(r-1) = h - \sum_{i=0}^{N-1} \{q_{II}(i,r-1) - q_{II}(i-1,r-1)\}\,F(N-i).$$

Also, writing $p_r = q_{r+1}$ for the quality of the $r$th item of process II,

$$F^A(N,r) = Y_{II}(r) + \sum_{i=0}^{N-1} q_{II}(i,r)\,F(N-i)$$
$$F^A(N,r) = Y_{II}(r-1) + p_r h + \sum_{i=0}^{N-1} \{(1-p_r)\,q_{II}(i,r-1) + p_r\,q_{II}(i-1,r-1)\}\,F(N-i)$$
$$F^A(N,r) = F^A(N,r-1) + p_r\,D_{II}(r-1) \tag{5.9}$$

and by (5.8)

$$F^A(N,r) = F(N) + (p_r - q_1)\,D_{II}(r-1). \tag{5.10}$$

If $D_{II}(r-1) \ge 0$ then by (5.8) $F^A(N,r-1) \le F(N)$, and if $D_{II}(r-1) \le 0$ then (since $p_r \ge q_1$) by (5.10) $F^A(N,r) \le F(N)$. Hence by Howard's principle process II is optimal for $N$, $F_{II}(N) \le F_I(N)$, and the theorem is proved by induction on $N$ and on $n$.

It is tempting to try to extend the fairly general advantage of quality of Theorem 5.4 to arbitrary quality sequences $\langle q \rangle$, or, in view of Theorem 5.5, at least to non-decreasing sequences.
This extension is false, however, as demonstrated by the following numerical example.

AN EXAMPLE WITH DISADVANTAGE TO QUALITY: Let $C_0 = h = 1$, $\langle q \rangle = 0.1, 1, 1, \ldots$, $\langle p \rangle = 0.2, 1, 1, \ldots$. The first process involves a lower quality at the first stage of production, but still $F_I(1) = 2.1$ whereas for the second "higher quality process" $F_{II}(1) = 2.2 > F_I(1)$.

VI. EXTENSIONS, APPLICATIONS AND LIMITATIONS

The formulation of (2.2) through (2.5) does not include an explicit "salvage value" for defective items or for good ones produced in excess of the outstanding order. Nevertheless, if items of both kinds have the same salvage value, the process still falls within the scope of our model by a straightforward transformation. Let the index $s$ refer to a process with salvage value $S$; then

$$F_s(N,n) = Y_s(n) + \sum_{i=0}^{N-1} q(i,n)\,\{F_s(N-i) - (n-i)S\} - Q(N,n)\,(n-N)\,S. \tag{6.1}$$

Define

$$F_T(N,n) = F_s(N,n) - NS$$
$$Y_T(n) = Y_s(n) - nS \quad (\text{or equivalently } C_j^T = C_j^s - S)$$

and then (6.1) becomes

$$F_T(N,n) + NS = Y_T(n) + nS + \sum_{i=0}^{N-1} q(i,n)\,\{F_T(N-i) + (N-i)S - (n-i)S\} - Q(N,n)\,(n-N)\,S$$
$$F_T(N,n) + NS = Y_T(n) + nS + \sum_{i=0}^{N-1} q(i,n)\,F_T(N-i) + \{1 - Q(N,n)\}\,(N-n)\,S - Q(N,n)\,(n-N)\,S$$
$$F_T(N,n) = Y_T(n) + \sum_{i=0}^{N-1} q(i,n)\,F_T(N-i) \tag{6.2}$$

and (6.2), which is equivalent to (2.4), solves the problem because minimization of $F_T(N,n)$ is clearly equivalent to minimization of $F_s(N,n)$. $Y_T(n)$ is the "excess production cost" of a lot of $n$ items (beyond its salvage value), and it should be emphasized that the transformation is useful only if "constant marginal efficiency" applies to the excess marginal cost $C_j^T$. $F_T(N)$ is the excess cost of an order for $N$ good items, beyond their salvage value.

In Wadsworth and Chang's model [9] the first set up cost need not equal the cost of subsequent set ups. This variation does not affect our analysis in any way, because the difference in cost between the first and subsequent set ups may simply be charged to $F(N)$, with a constant set up cost considered throughout.
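The exact recursion of Sections II and IV, and the numerical example given earlier in this section, can be checked with a short program. The following Python sketch is an illustration under stated assumptions, not the author's code: the names `solve_reject_allowance`, `quality` and `lot_cap` are hypothetical, marginal costs are taken as $C_j = h q_j$ (constant marginal efficiency), and the search over lot sizes uses the stopping rule of Theorems 3.4 and 4.1 (start at $n^*(N-1)+1$ and stop as soon as $f(N,n+1) > f(N,n)$).

```python
import math


def solve_reject_allowance(c0, h, quality, max_order, lot_cap=100_000):
    """Return (F, n_star) with F[N] = F(N) and n_star[N] = n*(N), N = 0..max_order.

    quality(j) gives q_j for j = 1, 2, ... (hypothetical interface).
    lot_cap is a safety bound on lot size, an assumption of this sketch.
    """
    pmfs = [[1.0]]  # pmfs[n][j] = q(j, n); q(0, 0) = 1
    costs = [c0]    # costs[n] = Y(n) = C_0 + h * (q_1 + ... + q_n)

    def extend_to(n):
        # Grow the tables with recursion (2.1) until lot size n is covered.
        while len(pmfs) <= n:
            m = len(pmfs)
            qm = quality(m)
            prev = pmfs[-1]
            new = [0.0] * (m + 1)
            for j in range(m + 1):
                stay = prev[j] if j < m else 0.0
                up = prev[j - 1] if j > 0 else 0.0
                new[j] = qm * up + (1.0 - qm) * stay
            pmfs.append(new)
            costs.append(costs[-1] + h * qm)

    F = [0.0]
    n_star = [0]

    def f(N, n):
        # Equation (2.3); infinite if the lot can never yield a good item.
        extend_to(n)
        p0 = pmfs[n][0]
        if p0 >= 1.0:
            return math.inf
        tail = sum(pmfs[n][j] * F[N - j] for j in range(1, min(n, N - 1) + 1))
        return (costs[n] + tail) / (1.0 - p0)

    for N in range(1, max_order + 1):
        n = n_star[N - 1] + 1
        while n < lot_cap and not f(N, n + 1) > f(N, n):
            n += 1
        n_star.append(n)
        F.append(f(N, n))
    return F, n_star


if __name__ == "__main__":
    # The example above: C_0 = h = 1, <q> = 0.1, 1, 1, ... and <p> = 0.2, 1, 1, ...
    F_I, _ = solve_reject_allowance(1.0, 1.0, lambda j: 0.1 if j == 1 else 1.0, 1)
    F_II, _ = solve_reject_allowance(1.0, 1.0, lambda j: 0.2 if j == 1 else 1.0, 1)
    print(round(F_I[1], 6), round(F_II[1], 6))  # the text reports 2.1 and 2.2
```

In both processes the best lot for $N = 1$ has two items, since the second item is always good; the "higher quality" first item only raises the cost of that lot from 2.1 to 2.2. Running the same function with `c0 = 0` gives $F(N) = Nh$, in line with Theorem 5.1.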
If the technological or economic parameters change between runs, let consecutive runs be indexed by $t = 1, 2, \ldots$ so that (3.2) becomes

$$v_t^I(k,N) = \sum_{j=0}^{N-1} q_t(j,k)\,\{C_{0,t+1} + v_{t+1}(0,N-j)\}.$$

Monotonicity of $n_t^*(N)$ and quasi-convexity of $F_t(N,n)$ still hold, because Theorem 3.4 is true for all $t$, as can be verified by careful inspection of the proofs involved. This interesting structural property has little, if any, computational value, however, because $F_{t+1}(M)$ must be evaluated for $M = 0, \ldots, N-1$ before $f_t(N,n)$ can be computed.

The unavoidable limitation of the model is, of course, the inherent restriction to processes with constant marginal efficiency. This domain naturally includes the dominantly popular $q_j = q$ and $C_j = c$ for all $j$. If $q_j$ varies during production, the model still includes all cases where direct cost is proportional to the number of good items produced, rather than the total number of items. It can serve as a fairly good approximation when quality incentive payments are a dominant part of direct production cost, or when a machine's power consumption, although varying with time during production, strongly affects both quality and cost, etc. Within the context of constant marginal efficiency, effects of "running-in" and "learning" (increasing $q_j$) or "fatigue" and "wear" (decreasing $q_j$) are certainly allowed in unlimited variations.

An interesting special application concerns contracting. Suppose $N$ good items are needed, and an agreement with a subcontractor can be secured whereby an order for the delivery of $n$ items is placed under the understanding that only good items are paid for, at constant price $h$ (independent of $n$). Then if $q(j,n)$ is the probability that out of $n$ items delivered $j$ items are good and $C_0$ is the cost of placing the order, the model applies.

The study of more general processes, where changes in marginal cost reflect more than changes in $q_j$, is certainly of interest.
It is doubtful, however, that the structure of such processes is nearly as powerful for the computation of an exact solution as in processes with constant marginal efficiency. In particular, it can be readily verified that $F(N,n)$ need not in general be quasi-convex, and a local minimum does not guarantee a global minimum. For processes with decreasing marginal efficiency we do not even expect that $n^*(N) \ge n^*(N-1)$, nor indeed that $n^*(N) \ge N$. With constant marginal efficiency $n^*(N) > n^*(N-1)$ insures that $n^*(N) \ge N$, so that at least for that class the term "reject allowance" is justified.

REFERENCES

[1] Blackwell, D., "Discrete Dynamic Programming," Annals of Mathematical Statistics, 33, 719-726 (1962).
[2] Bowman, E. H., and R. N. Fetter, Analysis for Production Management, Revised Edition (Richard D. Irwin, Inc., Homewood, Illinois, 1960), 324-330.
[3] Goode, H. P., and S. Saltzman, "Computing Optimum Shrinkage Allowances for Small Order Sizes," The Journal of Industrial Engineering, 57-61 (January-February, 1961).
[4] Gregory, W. R., and A. Beged-Dov, "On the Determination of Optimal Shrinkage Allowance in a Job Shop," The Journal of Industrial Engineering (April, 1967).
[5] Hillier, F. S., "Reject Allowances for Job Lot Orders," The Journal of Industrial Engineering, 311-316 (November-December, 1963).
[6] Howard, R. A., Dynamic Programming and Markov Processes (John Wiley and Sons, New York, 1960).
[7] Levitan, R. E., "The Optimum Reject Allowance Problem," Management Science, 6, 172-186 (1960).
[8] Llewellyn, R. W., "Order Sizes for Job Lot Manufacturing," The Journal of Industrial Engineering, 176-180 (May-June, 1959).
[9] Wadsworth, H. M., and S. H. Chang, "The Reject Allowance Problem: An Analysis and Application to Job Lot Production," The Journal of Industrial Engineering, 127-132 (May-June, 1964).

A CHANCE-CONSTRAINED DISTRIBUTION PROBLEM

Richard M. Reese and Andrew C.
Stedry
School of Economics and Management
Oakland University
Rochester, Michigan

ABSTRACT

The transportation model with supplies ($S_i$) and demands ($D_j$) treated as bounded variables developed by Charnes and Klingman is extended to the case where the $S_i$ and $D_j$ are independently and uniformly distributed random variables. Chance constraints are imposed which require that demand at the $j$th destination be satisfied with probability at least $\beta_j$ and that stockout at the $i$th origin occur with probability less than $\alpha_i$. Conversion of the chance constraints to their linear equivalents results in a transportation problem with one more row and column than the original, with some of the new arcs capacitated. The chance-constrained formulation is extended to the transshipment problem.

INTRODUCTION

Developments in network models have evolved from the first attempts by Dantzig [15] to solve transportation models using the simplex method through the stepping stone method of Charnes and Cooper [9] to recent advances in solution techniques for generalized networks (Balas [1], Balas and Hammer [2, 3, 4], Charnes and Kirby [11], Charnes and Raike [14]). Considerable emphasis has been placed on developing efficient computational techniques. Lemke's [22] dual method, Dantzig's row-column-sum method (see [16]), Orden's [24] characterization of the transshipment problem as a transportation model, Wagner's [27] techniques for capacitated networks, Ford and Fulkerson's network algorithms ([17] and particularly the out-of-kilter technique [18]) and Vogel's approximation method [25] represent significant advances. Recent developments include investigations of efficient dual methods by Glover, Klingman and Napier [19], computation of efficient initial solutions (Glover, Klingman and Napier [20], Napier [23], Glover, Karney, Klingman, Napier [21]), and an improvement of the out-of-kilter method (Barr, Glover and Klingman [5]).
The great strides made in computational methods for network problems in the past twenty years, coupled with comparable advances in computer technology in the same period, have resulted in special purpose computer codes that can solve extremely large network problems in seconds. Napier [23] reports solution times in the neighborhood of 30 seconds on a CDC 6600 for networks of 100 nodes (dense) and 200 nodes (non-dense) with up to 10000 arcs. Also, more recently, Ross, Klingman and Napier [26] have examined the effect on computational efficiency of various problem dimensions in transportation problems, such as the number of variables, rectangularity, density, number of constraints, and the variance and skewness of the objective function. It is reasonable to conclude that these special purpose algorithms can solve problems as large as will be encountered in actual networks. In an industrial application for, say, a monthly shipment schedule, many minutes, if not hours, might otherwise be devoted to solving a very large problem.

STOCHASTIC DEMANDS AND SUPPLIES

Thus far, developments in network models have been in the main deterministic. Charnes and Kirby [12] present a number of chance-constrained formulations which might be applied to networks. Our aim here, however, is to present a special purpose technique for chance constraints applied to transportation or transshipment problems. Briefly, the model permits chance constraints on shipments such that the probability that all demands be met at a destination is bounded below, while the probability of a stockout at an origin is bounded above.

We shall proceed from the standard transportation model to Charnes and Klingman's [13] treatment of supply and demand as bounded variables. From there, the extension to the chance-constrained model is quite natural and results in a tableau which is structurally identical to the bounded variables case.
THE CAPACITATED DISTRIBUTION MODEL

The chance-constrained distribution model is an adaptation of the Charnes and Klingman [13] modification of the distribution problem to encompass upper and lower bounds on the requirements. Let $x_{ij}$ be the amount flowing from node $i$ to node $j$, $c_{ij}$ the cost of that flow, $S_i$ the total supply available at node $i$, and $D_j$ the total demand at node $j$. Finally, let $I = \{i \mid i = 1, \ldots, m\}$ and $J = \{j \mid j = 1, \ldots, n\}$. The standard transportation problem is then:

    Minimize    $\sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij}$

    Subject to: $\sum_{j \in J} x_{ij} = S_i, \quad i \in I$
(1)             $\sum_{i \in I} x_{ij} = D_j, \quad j \in J$
                $x_{ij} \ge 0, \quad i \in I, \; j \in J$

Charnes and Klingman consider $S_i$ and $D_j$ to be bounded variables, thus allowing some flexibility in the distribution program. In effect this permits transportation costs to determine somewhat the requirements at the nodes. Designate $\bar{D}_j$ as the upper bound on $D_j$ and $\underline{D}_j$ as the lower bound. Then $\underline{D}_j \le D_j \le \bar{D}_j$ and, similarly, $\underline{S}_i \le S_i \le \bar{S}_i$ for all $i \in I$, $j \in J$. Now, append $I$ and $J$ so that $I' = \{i \mid i = 1, \ldots, m+1\}$ and $J' = \{j \mid j = 1, \ldots, n+1\}$. Consider a destination node where

(2a)    $\sum_{i \in I} x_{ij} = D_j$

where

(2b)    $\underline{D}_j \le D_j \le \bar{D}_j$.

Let $x_{m+1,j} = \bar{D}_j - D_j$. Equation (2a) becomes

(3)     $\sum_{i \in I} x_{ij} + x_{m+1,j} = \bar{D}_j$,

and the bound conditions (2b) can be rearranged so that

(4a)    $D_j \le \bar{D}_j \;\Rightarrow\; 0 \le \bar{D}_j - D_j = x_{m+1,j}$,
(4b)    $\underline{D}_j \le D_j \;\Rightarrow\; -\underline{D}_j \ge -D_j \;\Rightarrow\; \bar{D}_j - \underline{D}_j \ge \bar{D}_j - D_j = x_{m+1,j}$,

or, combining (4a) and (4b),

(5)     $0 \le x_{m+1,j} \le \bar{D}_j - \underline{D}_j$.

An analogous derivation holds for $\underline{S}_i \le S_i \le \bar{S}_i$, which results in

(6)     $0 \le x_{i,n+1} \le \bar{S}_i - \underline{S}_i$.

Thus the modified transportation problem is a capacitated distribution problem of the form:

    Minimize    $\sum_{i \in I'} \sum_{j \in J'} c_{ij} x_{ij}$

    Subject to: $\sum_{j \in J'} x_{ij} = \bar{S}_i, \quad i \in I$
(7)             $\sum_{i \in I'} x_{ij} = \bar{D}_j, \quad j \in J$
                $0 \le x_{m+1,j} \le \bar{D}_j - \underline{D}_j, \quad j \in J$
                $0 \le x_{i,n+1} \le \bar{S}_i - \underline{S}_i, \quad i \in I$
                $x_{ij} \ge 0, \quad i \in I, \; j \in J$

Note that the conditions on the summations of the capacitated variables assure that the totals over $I'$ and $J'$ balance, and also provide directly the actual total flow through the system by way of $x_{m+1,n+1}$.
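The construction of the capacitated problem from the bounds can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `augment_bounded_problem` is hypothetical, and the totals assigned to the added row $m+1$ and column $n+1$ (the sum of the upper demand bounds and the sum of the upper supply bounds, respectively) are an assumption chosen so that the augmented problem balances and $x_{m+1,n+1}$ equals the total flow.

```python
import math


def augment_bounded_problem(s_lo, s_hi, d_lo, d_hi):
    """Build supplies, demands and arc capacities of the (m+1) x (n+1) problem.

    s_lo/s_hi and d_lo/d_hi are the lower/upper bounds on supplies and demands.
    """
    m, n = len(s_lo), len(d_lo)
    # Rows 0..m-1 are the original origins; row m is the added slack origin.
    # ASSUMPTION: the slack row supplies sum(d_hi), the slack column absorbs sum(s_hi).
    supplies = list(s_hi) + [sum(d_hi)]
    demands = list(d_hi) + [sum(s_hi)]
    # Ordinary arcs are uncapacitated; only the slack arcs carry finite bounds.
    cap = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for j in range(n):
        cap[m][j] = d_hi[j] - d_lo[j]  # bound (5) on x_{m+1,j}
    for i in range(m):
        cap[i][n] = s_hi[i] - s_lo[i]  # bound (6) on x_{i,n+1}
    return supplies, demands, cap


supplies, demands, cap = augment_bounded_problem([8, 4], [10, 6], [5, 5], [7, 9])
print(supplies, demands)  # the two totals are equal by construction
```

Any capacitated transportation code can then be applied to `supplies`, `demands` and `cap` together with the cost matrix.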
CHANCE-CONSTRAINED ADAPTATION

Suppose, instead of treating the supplies and demands as bounded variables, we assume that the $S_i$, $i \in I$, are uniformly and independently distributed random variables in the intervals $[\underline{S}_i, \bar{S}_i]$ and the $D_j$, $j \in J$, independent uniform deviates in the intervals $[\underline{D}_j, \bar{D}_j]$. We now insist that all demands at the $j$th node be met with at least probability $\beta_j$, or

(8)     $P\{\sum_{i \in I} x_{ij} \ge D_j\} \ge \beta_j$.

Such chance constraints are readily converted to linear inequalities for uniform deviates (cf. Charnes, Cooper and Symonds [10] and Charnes and Cooper [6]). Let $x$ be uniformly distributed in $[a, b]$. Then

        $F(x) = \int_a^x \frac{dy}{b-a} = \frac{x-a}{b-a}$

so that $F(x) \ge \beta$ becomes

(9)     $x \ge a + \beta(b-a)$.

Since (8) is of the form $F_{D_j}(\sum_{i \in I} x_{ij}) \ge \beta_j$, we can apply (9) to yield

        $\sum_{i \in I} x_{ij} \ge \underline{D}_j + \beta_j(\bar{D}_j - \underline{D}_j)$.

Let $\hat{D}_j = \underline{D}_j + \beta_j(\bar{D}_j - \underline{D}_j)$, so that (9) becomes

(10)    $\sum_{i \in I} x_{ij} \ge \hat{D}_j$.

We can assume that no purpose is served by shipping more to node $j$ than the maximum possible demand. Hence,

(11)    $\hat{D}_j \le \sum_{i \in I} x_{ij} \le \bar{D}_j$.

The concern at the source nodes is that there not be a stockout, i.e., that $S_i$ will satisfy the shipping requirements from the node. Let $\alpha_i$ be the permitted stockout probability, so that $1 - \alpha_i$ is the desired probability that $S_i$ will exceed the programmed shipments from node $i$. The chance constraint can be expressed as

        $P\{S_i \ge \sum_{j \in J} x_{ij}\} \ge 1 - \alpha_i$,

or

        $1 - F_{S_i}(\sum_{j \in J} x_{ij}) \ge 1 - \alpha_i$,
        $F_{S_i}(\sum_{j \in J} x_{ij}) \le \alpha_i$,
(12)    $\sum_{j \in J} x_{ij} \le \underline{S}_i + \alpha_i(\bar{S}_i - \underline{S}_i)$.

Let $S_i^* = \underline{S}_i + \alpha_i(\bar{S}_i - \underline{S}_i)$. We assume that it is undesirable to ship less than $\underline{S}_i$, the minimum availability at node $i$. Thus,

(13)    $\underline{S}_i \le \sum_{j \in J} x_{ij} \le S_i^*$.

From (11) and (13) it is clear that the linear form of the chance constraints results in a distribution problem where $\sum_{i} x_{ij}$ and $\sum_{j} x_{ij}$ are bounded variables. Hence we can, by substituting $\hat{D}_j$ for $\underline{D}_j$ and $S_i^*$ for $\bar{S}_i$ in (7), express the chance-constrained distribution problem as:

    Minimize    $\sum_{i \in I'} \sum_{j \in J'} \bar{c}_{ij} x_{ij}$

    Subject to: $\sum_{j \in J'} x_{ij} = S_i^*, \quad i \in I$
(14)            $\sum_{i \in I'} x_{ij} = \bar{D}_j, \quad j \in J$
                $0 \le x_{m+1,j} \le \bar{D}_j - \hat{D}_j, \quad j \in J$
                $0 \le x_{i,n+1} \le S_i^* - \underline{S}_i, \quad i \in I$
                $x_{ij} \ge 0, \quad i \in I, \; j \in J$

where $\bar{c}_{ij} = E c_{ij}$.

EXTENSION TO THE TRANSSHIPMENT PROBLEM

The conversion of the transshipment problem to a transportation model has been shown formally by Orden [24], so we shall proceed here by example. Our conversion differs slightly from Orden's in that he deals with net flows at transshipment nodes by altering either the demands or supplies by the net amount while, for our purposes, altering the demand or the supply for nodes at which, respectively, demands and supplies exist is more satisfactory.

Consider the network of Figure 1(a). Node 1 is a pure source, node 4 is a pure sink, while 2 and 3 are transshipment nodes. A zero-cost feedback flow at the transshipment nodes can be introduced, as shown in Figure 1(b), without altering the problem. The network of Figure 1(b) can be represented as a distribution problem as shown in Figure 2(a).

[Figure 1. A simple network. Panels (a) and (b).]

[Figure 2. Transportation model form of the simple network. Panels (a) and (b).]

In the form of Figure 2(a), however, it is possible that the feedback flows can take on negative values. This can be prevented by insisting that the total flows out of node 2 (at which a demand exists) equal the maximum of the possible inputs to node 2, viz. the sum of the total supplies to the system. Similarly the inputs to node 3 (at which there is supply) are equated to the total output from the system. The altered distribution problem is shown in Figure 2(b), where 15, the total flow, is added to both the inputs to and outputs from the transshipment nodes.

In algebraic form, the Kirchhoff node conditions for the network of Figure 1(a) are shown in Table 1. Multiplying the equations for nodes at which demands exist by $-1$ we obtain the relations in Table 2. We observe that the pure source (1) and the pure sink (4) are already in transportation problem form.
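The adjustment described for Figure 2(b) can be checked numerically. The sketch below is illustrative only: the function name `to_transportation_form` and the sign convention (positive net value for a supply node, negative for a demand node) are assumptions, and the node data are those of the simple network, with net supplies of 10 and 5 at nodes 1 and 3 and net demands of 2 and 13 at nodes 2 and 4.

```python
def to_transportation_form(net, transshipment):
    """Return (total flow, row supplies, column demands) of the transportation form.

    net[node] > 0 is a net supply, net[node] < 0 a net demand (assumed convention).
    transshipment is the set of nodes that both receive and forward flow.
    """
    total = sum(v for v in net.values() if v > 0)  # total supply to the system
    supplies, demands = {}, {}
    for node, b in net.items():
        if node in transshipment:
            # The total flow is added to both the output (row) and input (column) side.
            supplies[node] = total + max(0, b)
            demands[node] = total + max(0, -b)
        elif b > 0:
            supplies[node] = b   # pure source: row only
        else:
            demands[node] = -b   # pure sink: column only
    return total, supplies, demands


total, supplies, demands = to_transportation_form(
    {1: 10, 2: -2, 3: 5, 4: -13}, transshipment={2, 3}
)
print(total, supplies, demands)
```

For this network the total flow is 15, the rows carry 10, 15 and 20, and the columns carry 17, 15 and 13, so both margins sum to 45.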
We can replace the equations for the transshipment nodes (2) and (3) by pairs of equations whose difference yields the original equation, viz.

Table 1. Kirchhoff Node Conditions for the Simple Network

    Node 1:   $10 = x_{12} + x_{13} + x_{14}$
    Node 2:   $-2 = -x_{12} + x_{23} + x_{24} - x_{32}$
    Node 3:   $5 = -x_{13} - x_{23} + x_{32} + x_{34}$
    Node 4:   $-13 = -x_{14} - x_{24} - x_{34}$

Table 2. Altered Kirchhoff Node Conditions

    Node 1:   $10 = x_{12} + x_{13} + x_{14}$
    Node 2:   $2 = x_{12} - x_{23} - x_{24} + x_{32}$
    Node 3:   $5 = -x_{13} - x_{23} + x_{32} + x_{34}$
    Node 4:   $13 = x_{14} + x_{24} + x_{34}$

    $17 = x_{12} + x_{22} + x_{32}$
    $15 = x_{22} + x_{23} + x_{24}$

and

    $15 = x_{13} + x_{23} + x_{33}$
    $20 = x_{32} + x_{33} + x_{34}$

In each case we have added an equation and a new variable representing the feedback node, thus leaving the determination of the system unchanged. The problem constraints can be represented in tableau form as shown in Table 3. The non-existent Arc (1, 4) is shown with an arbitrarily large cost, $M$, and the feedback nodes with 0 cost.

In brief, the conversion here first sets $S_i$ equal to the supply at node $i$ for pure sources and $D_j$ equal to demand for pure sinks. Then $B = \sum_i S_i$ is computed and the demand at transshipment node $j$ is set equal to $B + \max(0, D_j)$ and the supply equal to $B + \max(0, S_i)$.

Table 3. Simple Network in Transportation Model Form

    From \ To      2      3      4      S_i
    1                            M      10
    2              0                    15
    3                     0             20
    D_j           17     15     13      45

THE CHANCE-CONSTRAINED TRANSSHIPMENT MODEL

Let $I_1$ be the set of pure source nodes, $J_3$ the set of pure sinks, and let $I_2$ and $I_3$ represent transshipment nodes which are sources and sinks, respectively, and $J_1$ and $J_2$ the destinations which are source and sink transshipment nodes.
Clearly the problem is:

    Minimize    $\sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij}$

    Subject to: $\sum_{j \in J} x_{ij} = S_i, \quad i \in I_1$
                $\sum_{j \in J} x_{ij} = B + S_i, \quad i \in I_2$
                $\sum_{j \in J} x_{ij} = B, \quad i \in I_3$
(15)            $\sum_{i \in I} x_{ij} = B, \quad j \in J_1$
                $\sum_{i \in I} x_{ij} = B + D_j, \quad j \in J_2$
                $\sum_{i \in I} x_{ij} = D_j, \quad j \in J_3$
                $x_{ij} \ge 0, \quad i \in I, \; j \in J$

The chance-constrained problem, by substituting $B + S_i$ or $B + D_j$ for $S_i$ and $D_j$ in (8) and (12) as required, is readily comprehended as:

    Minimize    $\sum_{i \in I'} \sum_{j \in J'} \bar{c}_{ij} x_{ij}$

    Subject to: $\sum_{j \in J'} x_{ij} = S_i^*, \quad i \in I_1$
                $\sum_{j \in J'} x_{ij} = B + S_i^*, \quad i \in I_2$
                $\sum_{j \in J'} x_{ij} = B, \quad i \in I_3$
(16)            $\sum_{i \in I'} x_{ij} = B, \quad j \in J_1$
                $\sum_{i \in I'} x_{ij} = B + \bar{D}_j, \quad j \in J_2$
                $\sum_{i \in I'} x_{ij} = \bar{D}_j, \quad j \in J_3$
                $0 \le x_{m+1,j} \le \bar{D}_j - \hat{D}_j, \quad j \in J_2 \cup J_3$
                $0 \le x_{m+1,j}, \quad j \in J_1$
                $0 \le x_{i,n+1} \le S_i^* - \underline{S}_i, \quad i \in I_1 \cup I_2$
                $0 \le x_{i,n+1}, \quad i \in I_3$
                $x_{ij} \ge 0, \quad i \in I, \; j \in J$

where, as usual, it is assumed that $c_{ij}$ and $x_{ij}$ are independent so that $E c_{ij} x_{ij} = \bar{c}_{ij} x_{ij}$.

CONCLUSIONS

The addition of chance constraints to the transportation and transshipment models has been accomplished with only a trivial increase in computational requirements. In an $m \times n$ transportation model the conversion involves the addition of $m+n+1$ arcs, where at most $\max(m, n)$ of these are capacitated. Thus, large scale chance-constrained distribution models can be solved using already available computer codes.

RECOMMENDATIONS FOR FURTHER RESEARCH

The chance constraints investigated here with uniform random deviates can be adapted readily to any density of the $S_i$ and $D_j$ provided, of course: (1) the variables are independently distributed; (2) the density functions are non-zero only in a finite interval (i.e., truncated above and below); and (3) the inverse cumulative distribution function can be computed or derived by Monte Carlo methods. The triangular and beta densities immediately present themselves, as do empirically derived densities, which are inherently truncated.

In its present form the model belongs to the class of zero order decision rule chance-constrained models. Thus, further work must be done to expand the model in a dynamic context to maximizing over a multi-period horizon.

BIBLIOGRAPHY

[1] Balas, Egon, "The Dual Method for the Generalized Transportation Problem," Management Science, 12, 555-568 (1966).
[2] Balas, Egon and P. L.
Hammer (Ivanescu), "On the Generalized Transportation Problem," Management Science, 11, 188-202 (1964).
[3] Balas, Egon and P. L. Hammer (Ivanescu), "On the Transportation Problem, Part I," Cahiers du Centre d'Etudes de Recherche Operationelle, 4, No. 2 (1962).
[4] Balas, Egon and P. L. Hammer (Ivanescu), "On the Transportation Problem, Part II," Cahiers du Centre d'Etudes de Recherche Operationelle, 4, No. 3 (1962).
[5] Barr, R. S., F. Glover and D. Klingman, "An Improved Version of the Out-of-Kilter Method and A Comparative Study of Computer Codes," Mathematical Programming, 7, 60-86 (1974).
[6] Charnes, A. and W. W. Cooper, "Chance-Constrained Programming," Management Science, 6, 73-79 (1959).
[7] Charnes, A. and W. W. Cooper, "Deterministic Equivalents for Optimizing and Satisficing under Chance Constraints," Operations Research, 11, 18-39 (1963).
[8] Charnes, A. and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (New York: John Wiley & Sons, Inc., 1961).
[9] Charnes, A. and W. W. Cooper, "The Stepping Stone Method of Explaining Linear Programming in Transportation Problems," Management Science, 1, No. 1 (1954).
[10] Charnes, A., W. W. Cooper, and G. H. Symonds, "Cost Horizons and Certainty Equivalents: An Approach to Stochastic Programming of Heating Oil," Management Science, 4, 235-263 (1958).
[11] Charnes, A. and M. Kirby, "The Dual Method and the Method of Balas and Ivanescu for the Transportation Model," Cahiers du Centre d'Etudes de Recherche Operationelle, 6, No. 1 (1964).
[12] Charnes, A. and M. J. L. Kirby, "Some Special P-Models in Chance-Constrained Programming," Management Science, 14, 183-195 (1967).
[13] Charnes, A. and D. Klingman, "The Distribution Problem with Upper and Lower Bounds on Node Requirements," Management Science, 16, 638-642 (1970).
[14] Charnes, A. and W. M. Raike, "One-Pass Algorithm for Some Generalized Network Problems," Operations Research, 14, 914-924 (1966).
[15] Dantzig, G.
B., "Application of the Simplex Method to a Transportation Problem," in T. C. Koopmans (ed.), Activity Analysis of Production and Allocation, Cowles Commission Monograph No. 13 (New York: John Wiley & Sons, Inc., 1951).
[16] Dantzig, G. B., Linear Programming and Extensions (Princeton, N.J.: Princeton University Press, 1963).
[17] Ford, L. R., Jr., and D. R. Fulkerson, Flows in Networks (Princeton, N.J.: Princeton University Press, 1962).
[18] Ford, L. R., Jr., and D. R. Fulkerson, "An Out-of-Kilter Method for Minimal Cost Flow Problems," SIAM Journal, 9, No. 1 (1961).
[19] Glover, F., D. Klingman and A. Napier, "An Efficient Dual Approach to Network Problems," Working Paper 71-57, The University of Texas, Austin, Texas (May 1971).
[20] Glover, F., D. Klingman and A. Napier, "A One-Pass Algorithm to Determine a Dual Feasible Basic Solution for a Class of Capacitated Generalized Networks," Center for Cybernetic Studies, Research Report 42, The University of Texas, Austin, Texas (October, 1970).
[21] Glover, F., D. Karney, D. Klingman and A. Napier, "A Computation Study on Start Procedures, Basis Change Criteria, and Solution Algorithms for Transportation Problems," Management Science, 20, 793-813 (1974).
[22] Lemke, C., "The Dual Method of Solving Linear Programming Problems," Naval Research Logistics Quarterly, 1, No. 1 (1954).
[23] Napier, H. A. Jr., "Some Algorithmic Procedures for Networks and their Computational Relationship with Existing Network Algorithms," Doctoral Dissertation, The University of Texas, Austin, Texas (May 1971).
[24] Orden, A., "The Transshipment Problem," Management Science, 2, 276-285 (1956).
[25] Reinfeld, N. V. and W. R. Vogel, Mathematical Programming (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1958).
[26] Ross, Terry G., D. Klingman and A.
Napier, "A Computational Study of the Effects of Problem Dimensions on Solution Times for Transportation Problems," Journal of the Association for Computing Machinery, 22, 413-424 (1975).
[27] Wagner, H. M., "On a Class of Capacitated Transportation Problems," Management Science, 5, 304-318 (1959).
[28] Wagner, H. M., Principles of Operations Research (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1969).

ELEMENTS OF A THEORY IN NON-CONVEX PROGRAMMING

Claude-Alain Burdet
SYSTEMATHICA Consulting Group Ltd.
Pittsburgh, Pennsylvania

ABSTRACT

The question of necessary and sufficient optimality conditions for non-convex programs is analyzed in the general context of subadditivity. Several types of convex set extensions are investigated to generate valid inequalities from the corresponding gauge functions.

1. SUMMARY

We first present a set of "naive" optimality conditions (necessary and sufficient) applicable to a general mathematical program. An example and its Kuhn-Tucker conditions are described to show why such necessary conditions are impractical in a non-convex situation. We then proceed with necessary conditions in inequality form and show how this concept is related to that of subadditive gauge functions (see Section 3.2).

In Sections 3.3 and 3.4, we generalize Tuy's intersection method and describe the construction of cutting planes primarily based on the objective function; it is shown that Tuy's cuts are uniformly dominated in the present framework. Dominance is strict when the level set of the objective function is unbounded.

We next investigate Tuy's idea of convex extension. A new formulation allows relaxation of the convexity assumption for the objective function and produces quasi-convex extensions which subsume Tuy's concept. In fact extensions can be defined in very general terms (see Section 3.6), and any valid inequality (including the most stringent ones) can be cast in this framework.
Section 4 investigates an analytical characterization of the concept of extension and the possibility of explicitly constructing the corresponding cutting planes. This is obtained from convex, quasi-convex and/or polaroid gauge extensions.

The basic thrust of our research is to obtain results stronger than Tuy's cuts, which have proved disappointing. We use subadditive gauges on the one hand and extensions on the other as a vehicle to circumvent the obstacles met by Tuy's concepts and methodology. Although no algorithmic implication is discussed here in depth, the general direction of this study is aimed at solving some typical problems in the difficult area of non-convex optimization.

2. OPTIMALITY

Consider the mathematical program

    Maximize $f(x)$ subject to $x \in X \subset R^n$.

One has the immediate necessary and sufficient optimality conditions:

(2.1)   $\bar{x} \in X$ is optimal with $\bar{k} = f(\bar{x})$ iff $X \subset \mathrm{lev}_{\bar{k}} f$.

The above optimality conditions are "naive" in the sense that they yield no constructive solution to the mathematical program and, in fact, they represent little more than a parody of the definition of global optimality.

It is well known in the non-linear programming literature that necessary conditions for local optimality can be obtained from the Kuhn-Tucker theory; furthermore, under suitable convexity assumptions for $f$ and $X$, these conditions turn out to be sufficient, so that the question of optimality is fully answered. In the absence of convexity, the situation is quite different, however: sufficiency is no longer guaranteed. Moreover there frequently exists an inordinate number of local optima, so that explicit search is impractical. But the situation is worse yet (as shown in the illustrative example (Figure 1a) below): in addition to numerous local optima, there also exists a myriad of "useless" Kuhn-Tucker points of the saddle type, which are not locally optimal! Thus global optimization by means of successive inspection of K-T points tends to be very inefficient.
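The proliferation of Kuhn-Tucker points can be seen by brute-force enumeration on a small separable problem. The sketch below is illustrative and assumes the objective $f(x) = \sum_i (x_i^2 + \epsilon_i x_i)$ on the hypercube $-1 \le x_i \le 1$ with $0 < \epsilon_i < 2$; the function name `classify_kt_points` is hypothetical. Under that assumption each coordinate of a Kuhn-Tucker point sits at $-1$, at $+1$, or at the interior stationary value $-\epsilon_i/2$.

```python
from itertools import product


def classify_kt_points(eps):
    """Count Kuhn-Tucker points of max sum(x_i^2 + eps_i*x_i) on [-1, 1]^n.

    ASSUMPTION: 0 < eps_i < 2, so both bounds satisfy the sign conditions.
    """
    counts = {"local max": 0, "local min": 0, "saddle": 0}
    best_value = float("-inf")
    # Per coordinate: lower bound (-1), interior stationary value (None), upper bound (+1).
    for choice in product((-1, None, 1), repeat=len(eps)):
        x = [-e / 2 if c is None else c for c, e in zip(choice, eps)]
        value = sum(xi * xi + e * xi for xi, e in zip(x, eps))
        n_free = sum(c is None for c in choice)
        if n_free == 0:
            counts["local max"] += 1      # a vertex of the hypercube
            best_value = max(best_value, value)
        elif n_free == len(eps):
            counts["local min"] += 1      # the unique interior minimum
        else:
            counts["saddle"] += 1         # mixed case: not locally optimal
    return counts, best_value


counts, best = classify_kt_points([0.1, 0.2])
print(counts, best)  # 4 local maxima, 1 local minimum, 4 saddle points for n = 2
```

The total is $3^n$ points, of which only $2^n$ are local maxima, so the share of useless saddle points grows quickly with $n$.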
EXAMPLE: Consider the "concave programming" problem:

    Maximize $f = \sum_i (x_i^2 + \epsilon_i x_i)$ subject to $-1 \le x_i \le 1$, $i = 1, \ldots, n$,

where the $\epsilon_i$ are small positive quantities, $\forall i$.

    Optimal value: $n + \sum_i \epsilon_i = \bar{k}$
    Optimal solution: $x_i = 1$, $\forall i$

[Figure 1a]

†The notations lev, aff, cl, epi, bd and conv denote the level set, affine hull, closure, epigraph, boundary set and convex hull respectively [8].

For $n = 2$, as in Figure 1a, one has:

    Number of Kuhn-Tucker points: $\sum_{i=0}^{n} \binom{n}{i} 2^{n-i} = 9$
    Number of local minima (global): $\binom{n}{n} 2^0 = 1$ (convex case)
    Number of local maxima: $\binom{n}{0} 2^n = 4$
    Number of saddle points: $\sum_{i=1}^{n-1} \binom{n}{i} 2^{n-i} = \binom{2}{1} 2 = 4$

In this case there is a Kuhn-Tucker point on each face (of every dimension $k \le n$) of the hypercube $X$; thus the Kuhn-Tucker theory requires one to check an enormous number of points, only very few of which are locally optimal; but even local optima are too numerous ($2^n$) to be checked explicitly as soon as $n$ becomes reasonably large (say $> 20$).

3. INEQUALITIES

3.1. Generalities

In the search for alternatives to the Kuhn-Tucker theory, one may attempt to verify (constructively) the inclusion $X \subset \mathrm{lev}_{\bar{k}} f$. This can be conceptually accomplished in many different ways, which eventually amount to "splitting" $X$ into several subsets with subsequent verification of the inclusion for each subset. Cutting planes are usually associated with a repeated "shrinking process" of the feasible set, which may lead to convergence difficulties; alternatively, another approach is described below: it is finite and therefore remains free of convergence problems.

In practice a large number of subsets is usually required; but the use of cutting planes (i.e., necessary conditions) may in certain cases furnish some hope of efficiently curtailing the search for global optimality.
EXAMPLE: Let $X$ be defined by the inequalities

(3.1)   $g_k(x) \le 0$, $k \in M$.

The Kuhn-Tucker conditions are known to contain a set of complementarity conditions which read:

(3.2)   $\lambda_k g_k = 0$, $\forall k \in M$.

This structure can now be used to set up a dichotomous arborescence which generates $2^{m+1} - 1$ subproblems of $X$ according to the two possibilities:

(3.3)   a) $\lambda_k = 0$, $g_k \le 0$
        b) $g_k = 0$, $k \in M$

which both imply (3.2) and are mutually exclusive.

This "facial" decomposition methodology [5] has been applied successfully to some classes of problems in integer programming and in general quadratic programming; it is particularly useful when $X$ is polyhedral (i.e., $g_k$ linear). Facial decomposition is a natural extension of the idea of Branch and Bound, as both methods become identical when $X$ is a hypercube.

The use of cutting planes offers a possibility of discarding certain subproblems from further explicit computations. Since the proposed method is similar to Branch and Bound algorithms, it possesses the same underlying structure of exhaustive search among candidates for the global optimum. Thus the process is intrinsically finite. For a polyhedral feasible set $X$, one obtains a tree structure which is completely determined by the combinatorial structure of $X$; each node is a program on a face of $X$ (subprogram). Nodes are fathomed by a machinery of cutting planes designed to eliminate subprograms (faces) known to contain no point with a better objective function value than the current best solution (see [5] for more details).

3.2. Inequalities Generated From Subadditive Gauge Functions

We now investigate a general framework for the construction of inequalities.

DEFINITION 1: A subadditive gauge* function $\pi$ is defined to satisfy:

SUBADDITIVITY:
(3.4a)  $\pi(u) + \pi(v) \ge \pi(u+v)$, $\forall u, v$

GAUGE:
(3.4b)  $\pi(\lambda u) = \lambda \pi(u)$, $\forall \lambda > 0$ (positively homogeneous of degree one)

LEMMA 1: For gauge functions, subadditivity and convexity are equivalent.

PROOF: From convexity, $\pi(\alpha u + (1-\alpha)v) \le \alpha\pi(u) + (1-\alpha)\pi(v)$; hence for $\alpha = 1/2$ one obtains, by homogeneity, $\pi(u+v) \le \pi(u) + \pi(v)$, i.e.
subadditivity. Conversely: $\alpha\pi(u) + (1-\alpha)\pi(v) = \pi(\alpha u) + \pi((1-\alpha)v) \ge \pi(\alpha u + (1-\alpha)v)$. Q.E.D.

For simplicity we now assume that every point $x \in X$ can be represented by reference to a given finite system of vectors $\{e_j \in R^n\}$ with respect to $\bar{x} \in \mathrm{Aff}\, X \subset R^n$, i.e. $x = \bar{x} + u = \bar{x} + \sum_j e_j t_j$. Assume the "coordinates" $t_j$ to be non-negative, i.e. the system $\{e_j\}$ spans a cone at $\bar{x}$ which contains $X$.

*Rockafellar [8] introduced this gauge function terminology in the context of Minkowski functionals characterizing a convex body; the present concept is closely related and we use the same term to characterize a broader class of functions.

DEFINITION 2: An inequality

        $\sum_{j \in N} \pi_j t_j \ge \pi_0$

is called $k$-valid if the set

(3.5)   $X \cap \{x = x(t) \mid \sum_{j \in N} \pi_j t_j < \pi_0\}$

contains no point $x$ such that $f(x) > k$.

THEOREM 1: Let $\pi$ be a subadditive gauge: $\{\mathrm{Aff}(X) - \bar{x}\} \to R$; then the inequality

(3.6a)  $\sum_{j \in N} \pi_j t_j \ge \pi_0$

is $k$-valid, with $\pi_j = \pi(e_j)$ and

(3.6b)  $\pi_0 = \min \{\pi(u) \mid x \in X, \; f(x) \ge k\}$, where $u = \sum_j e_j t_j = x - \bar{x}$, $\forall x \in X$.

PROOF: By induction on $n$. Since $\pi$ is a gauge and $t_j \ge 0$, $\forall j \in N$, one has $\pi_j t_j = \pi(e_j) t_j = \pi(e_j t_j)$. Summing over $j$ and using subadditivity, one obtains $\sum_j \pi_j t_j \ge \pi(\sum_j e_j t_j) = \pi(u)$. Now $\pi(u) \ge \pi_0$, $\forall x \in X$ with $f(x) > k$, by definition. Q.E.D.

REMARK: Since one usually does not know in advance the optimal value $\bar{k}$, one has to construct $k$-valid inequalities with $k \le \bar{k}$, where $k$ is a known (or estimated) lower bound for the maximum $\bar{k}$.

EXAMPLE: Let $X$ be characterized by the linear system

        $x_i = \sum_j a_{ij} t_j \le \bar{x}_i$, $i \in M$; $\quad t_j \ge 0$

as customary in linear programming. Then one may define as in (3.6a):

        $\pi(e_j) = \pi(a_j)$, with $e_j = a_j = (a_{1j}, \ldots, a_{ij}, \ldots, a_{mj})$,
        $\pi_0 = \min \{\pi(u) \mid u = x - \bar{x}, \; f(x) \ge k, \; x \in X\}$.

For instance, choose (see [3])

        $\pi = \sum_{i \in M} p_i(u_i) |u_i|$

with parameters

        $p_i(u_i) = p_i^+$ if $u_i > 0$, $\quad p_i(u_i) = -p_i^-$ if $u_i < 0$,

where $p_i^+ \ge p_i^-$ for convexity of $\pi$ (hence subadditivity since $\pi$ is clearly a gauge) i.e.
(3.7a)  $\pi_j = \sum_i p_i(a_{ij}) |a_{ij}|$
(3.7b)  $\pi_0 = \min \{\sum_i p_i(u_i) |u_i| \mid f(x) \ge k\}$

This introductory example exhibits several features which are characteristic of the approach developed here:

a) The subadditive gauge $\pi$ is an auxiliary tool which can be chosen independently of the problem (i.e. $f$ and $X$). Of course there may exist some practical advantages in constructing "tailor made" gauges which reflect the structure of $f$ and/or $X$. Indeed the program (3.6b) may be very difficult for arbitrary functions $\pi$. Most of the remaining developments of this paper deal with the question of exploiting any information contained in $f$ and/or $X$ to construct "appropriate" gauges in that respect.

b) The coefficients $\pi_j$ of the inequality may be $> 0$, $= 0$ or $< 0$; thus the theory is able to generate any type of valid inequality.

c) The amount of computation required for the construction of the inequality lies in the determination of $\pi_j$ and $\pi_0$; the above example shows, however, that the computations for $\pi_j$ are of a different nature than for $\pi_0$. Since there are $n$ coefficients $\pi_j$, one will typically choose $\pi$ functions where $\pi_j$ is easily obtained (as in the example); the major part of the computational effort then consists in the determination of the coefficient $\pi_0$. Note that a lower bound $\underline{\pi}_0 \le \pi_0$ is sufficient to yield a $k$-valid inequality.

3.3. Subadditive Gauges Directly Related to f

Consider the "concave programming" problem

    Maximize $f(x)$
    Subject to $x \in X \subset R^n$

where $f$ is quasi-convex. Assume that a value $k$ ($\le \bar{k}$) and a point $\bar{x} \in \mathrm{Aff}\, X$ with $f(\bar{x}) \le k$ are known. We now construct a gauge function $\pi$ based on the convex set $(\mathrm{lev}_k f) - \bar{x}$ (see Figure 1b):

(3.8a)  a) $\pi$ is a convex gauge, in particular $\pi(0) = 0$
(3.8b)  b) $\forall x \ne \bar{x}$ such that $f(x) = k$, let $u = x - \bar{x}$,

Figure 1b. Construction of an $f$-related gauge (see (3.8)). The case (3.8a) applies to the point $\bar{x}$.
The case (3.8b) applies to the top half of the illustration, including the line 2 which is tangential to the set X at x- The case (3.8c) applies to the lower half, including the line 1. One has 7r<7r„ within the entire set lev* /; along the boundary of levt /, where /(x) = A, one has 7r = jro (curved part) and ■k<-Wo (linear part). and set (3.8c) 7^(^i): I iTo/X* otherwise, where X*= max {/ (Xa;+(1 — X)x)<A;}, i.e. X*>1. 0A<+<» Note that when/(x)<^A:, and levt/ is a bounded set the condition b) reduces to: ■7r(u) =1^0, -Vx such that/(a;) =k; indeed convexity of lev^:/ implies that X=l is the unique solution of /(Xx+(l-X)J)=^. Choose a system {e^} at x such that ■VxeX: x=x+^ ^jijt ^j>0 J we now have : LEMMA 2: (3.8d) ir(u)<7r„, ¥w€[(levt/)— J] i.e. \ev^,ir=[{\eYj) —x] PROOF: Take x with/(a;)>/:. Since tt is a gauge, one has on the ray Xu, with u=x — x and Xe[0, 1]: T(\u) = \ir{u). M C. BURDET But by hypothesis y(x)<Ar; hence for some Xe[0, 1), one has/(Xit+x)=^: a) if X = is the only such value, then 7r(Xu) = 7r(tt) = + «> >■"■„, 4^X>0 (from 3.8b); and Xibd Iev^„7r. b) Otherwise for all X wiihfi\u+x)=k, one has \ir{u) <\*ir{u) = iro, from (3.8c); furthermore (3.8c), i.e. X*:<1 and/(a;)<A:, imply X*< 1. Hence one must have 7r(w)>7r. Q.E.D. LEMMA 3: Assume local optimality of x i.e. /(x+Xe^) </(x), 4^Xe[0, e) with e>0 sufficiently small. Then a) ir; = 7r(,/X/, with X_,*= max {f{\ej-\-x)=k\ o<x<+«> b) ir;<0 otherwise PROOF: a) by construction one has X^*7r; = ir(ejX;*) = 7r(,; since local optimality of a; implies that for some X>0, the case 7r=+ oo cannot occur. b) liJ{ej\j-\-x)< k, -RX^O>0, one may always set ir^=0, i.e. X^*= <» ; however non-negativity of the TT gauge is not assumed, nor implied by the convexity assumption (3.4). Q.E.D. THEOREM 2: Assume (3.8) to hold true and let x be locally optimal with/(x):<A:. 
Then UN is a k-valid inequality, where (3.9a) a) iTj=irJ\j*, \j*— max {f{x-\-\ej)=k} 0<X< + o> (3.9b) b) 7ry<0, for the other jeA^' where (3.9a) does not apply. PROOF: Follows from the Lemmas 2 and 3. Q.E.D. REMARK: Theorem 2 remains true if x is only assumed to satisfy /(x)</:. COROLLARY 2.1 (Tuy) : Assume /(x)< A:. The cut J€N is /:- valid, with (3.10) 0<T, = 7r„/X,., where A ^ X= max {/(a;+Xe;)<t}< + oo. 0<X< + co PROOF : We need verify the hypothesis (3.8) of Theorem 2 : (3.8b) is immediate from Lemma 3 which contains the construction (3.10) as a special case. We now show that the resulting gauge ir is convex and therefore satisfies (3.8a). ir is constructed with convex level sets (3.8d) and is therefore quasi-convex; it is also non-negative. Lemma 4 below completes the proof. Q.E.D. LEMMA 4: A non-negative quasi-convex gauge w is convex. PROOF: We show that the set epi 7r={(7r, «)|7r>ir(u), "tb^R"} is a convex cone in i?"+'. Since tt is a gauge, one has ■VueR'' with 0<'n-(w)<+ oo ; Tr{u) = Tr{nv) = tJLiriv) = n where u=nv, and iLt>0 with t^ebd lev, tt, i.e. ir(2;) = l Thus all (strictly) positive values of tt are completely determined by the set leviir. NON-CONVEX PROGRAMMING THEORY 55 Since levi ir is convex, by the quasi-convexity assumption on tt, there exists (at least) one hyperplane HcU=l which supports levi ir at the point v, i.e. HtU>l, ^Vue levi ir and H„v=\. Since the gauge ir is non-negative the set epi tt can be represented as an intersection of the following convex sets. epi7r= n \{h,v)\h>xatiX {0,H^u],v,tR''\ » < bd ievi IT Q.E.D. 3.4. Dominance As indicated by the above Corollary, one sees that Tuy's theory is generalized in the present framework : (i) For a given level set lev*/, one constructs in a straightforward manner the subadditive gauge which corresponds to Tuy's cut (see Corollary 2.1). 
(ii) Further A:-valid inequalities can be obtained, however, when lev* / contains unbounded rays {tjej-^x), jtN: Any convex gauge w which agrees identically with ir on the set Rj of intersected directions Ri= {ue [AS X—x]\fi\u-\-x)y'k, for some X>1 ) will yield such an inequality. EXAMPLE (See Figure 2):j={xy)-\ k=l/2; ir<,= l, x={2, 1) i.e. x=u+2, y=v-\-l. A A A A) u<0 or v<0: in the construction (3.8b) we must find X from [(Xw+2)(Xw+l)]~'=l/2 A A i.e., \^uv-\-\{u-\-2v) = and thus A^ -(u+2t>) uv Figure 2a 56 C. BURDET Figure 2b. — Illustration of the construction of Tuy's cut. One intersection point is R, the other at infinity, because the gauge tt vanishes in u, v, >0. The point R can be determined geometrically by intersecting the ray r with the curve. Now in the region one has 7r=X '; 7r=+ oo otherwise. Thus: -u+y>0 -uv {u+2v) H for u>0, v<0 and - u-{-v>0 ioru<0, v>0,lu+v>0 X (w, tj) = + 00 f or - ^i+^;<0 7r(0, 0)=0 B) i/.>0, tJ>0 (Tuy) ; 7r(u, «;)=0. But other gauges can be obtained as follows: d7r_ —V uv _j_ —v^ dM~u+2y"'" {u-\-2vy~ {u^2vY dv -u 2uv -W for and for dv (u-\-2vy {u+2vy {u+2vy' dir —v^ 1 ^=^^§^^#=-2''^^' ^ dx —u^ . „ ov u^ NON-CONVEX PROGRAMMING THEORY 57 Figure 2c. — Cut 1 differs from Tuy's cut in that it now passes through the points S and R; in Tuy's cut the point <S was at 00 , while here the function . defines S. The polaroid gauge construction (4.8) corresponds geometrically to the envelope, within the non- negative quadrant u, v>0, of all tangent hyperplanes to the set epi / along the u and v axes. In the re- maining region of R^ with M+2y>0, the polaroid gauge ir remains identical to < A — uv u+2v Thus, in the non-negative quadrant, one can choose any value satisfying: 0>ir{u,v)> max — ^u, —v\- The analytical derivation of this construction is based on polaroid extensions (See Section 4.2). One easily verifies that the functions r defined by A) and B) are all convex gauges with cl (levj 7r) = (levi/— x). 
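The gauge of case A in the example above can be checked numerically. The following is a minimal sketch, assuming the data of the example ($f(x, y) = (xy)^{-1}$, $k = 1/2$, $\pi_0 = 1$, $\bar{x} = (2, 1)$) and restricted to directions with $uv < 0$ and $u + 2v > 0$; the function names are illustrative and do not come from the paper.

```python
K = 0.5               # level k of the example
PI_0 = 1.0            # right-hand side pi_0 of the example
X_BAR = (2.0, 1.0)    # base point x-bar, where f(x-bar) = k


def f(x, y):
    """Objective of the example: f(x, y) = 1 / (x * y)."""
    return 1.0 / (x * y)


def level_set_gauge(u, v):
    """pi(u, v) = pi_0 / lambda*, where lambda* > 0 solves f(x-bar + lambda*(u, v)) = k.

    Only directions with u * v < 0 and u + 2v > 0 are handled by this sketch;
    the text treats the remaining directions separately.
    """
    if not (u * v < 0.0 and u + 2.0 * v > 0.0):
        raise ValueError("direction outside the region covered by this sketch")
    # Nonzero root of lam^2 * u * v + lam * (u + 2v) = 0.
    lam = -(u + 2.0 * v) / (u * v)
    return PI_0 / lam


def boundary_point(u, v):
    """Point where the ray from x-bar along (u, v) meets the curve f = k."""
    lam = PI_0 / level_set_gauge(u, v)
    return (X_BAR[0] + lam * u, X_BAR[1] + lam * v)
```

For the direction $(-1, 1)$ the sketch gives $\pi = 1$, and the ray meets the curve $f = 1/2$ at the point $(1, 2)$. Doubling the direction doubles the value, as expected of a gauge.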
Note that there are, in general, many possibilities to define suitable gauges in the region u>0, v>0; all generalize Tuy's cut; therefore it is natural to investigate the "best" such generalization. THEOREM 3 : (dominance) Assume ira>0. The ^-valid inequality ^v,tj>iro, where tj>0 j dominates uniformly any inequality iff aj>'Kj, -Vj. 58 C. BURDET Figure 2d. — Here the set lev^, tt is delimited by the curve. /=i ("above" P) the tangent line segment P QT and the half-line through Tx. One sees that ir=—v has disappeared and one now has uv -(648u + v) 7r=— - U, ir — —s-' T = 2 u-\-2v and 7r= + <» in the four sectors defined in the illustration. 1075 PROOF: Since tj>Q, one has ajtj>irjtj; hence Q.E.D. COROLLARY 3.1.: The inequality dominates Tuy's cut uniformly. PROOF: Immediate from (3.10) The "best" inequality can now be defined as one which dominates all others, i.e. 7r(u)=inf ir(w), where ir satisfies (3.8). iF is a convex gauge because it is defined by pointwise inf in a class of convex gauges. In the above example w is easily seen to correspond to the choice for w>0, v>Q. k{u, w)=max \—n ^> ~y NON-CONVEX PROGRAMMING THEORY 59 Figure 2e. — Illustration of the most stringent inequality: cut 3 is vertical through P. We shall see in Section 4.1 that the gauge iF is the maximal convex extension of tt with respect to the set S=Rj. (See Remark i?l in Section 4.1 and cut 1 in Figure 2). 3.5. Subadditive Gauges Based on f and X In his note [6] Tuy noticed that one could improve his cut by a conceptually simple argument. He characterizes a convex extension F of J with respect to -X" by: (i) F(x)<f(x) ^x (3.11) (ii) F(x)=f(x) A^x^X (iii) F convex This leads to the following "improvement" of the optimality condition (2.1): Xcz levj; /«e»Zc lev^ F, where the improvement stems from the property levj /clevj F. Further results in this direction can be obtained within a subadditive framework: (3.12a) a) TT is a convex gauge with 7r(0)=0. 
(3.12b) b) as (3.8b) where only those x which satisfy xeX are considered. REMARK: (3.12b) is thus a relaxation of (3.8b) and one has (3.12c) (cl lev., tt) n X= [(lev, /) -5] X; 60 C. BURDET this r also generates a ^-valid inequality; furthermore, (3.13) (cl lev.^x)=3[(lev,/)-x] and from (3.8d), one sees that the inequality based on (3.12) will uniformly dominate the in- equality based on (3.8). This result also dominates Tuy's extension (3.11): indeed (3.12) is based on the convexity of the level sets and therefore corresponds to a quasi-convex extension rather than the convex extension (3.11) ; moreover we do not require f to be convex (see also [2]). 3.6. The "Best" Subadditive Gauges Finally one may bring an ultimate improvement in the definition of the gauge t which will yield (at least conceptually) Ar-valid inequalities which dominate uniformly all the others. Define the set S.=={x=x{t)eX\^ir,tj<x„] J and impose the following conditions on tt: (3.14a) a) tt is a convex gauge, with 7r(0) = (3.14b) b) as (3.8b) where only those x which satisfy xeS-^ are considered. REMARK: (3.14) is a relaxation of (3.12) because S^czX. LEMMA 5: The gauge ir defined by (3.14) generates a A:-valid inequality. PROOF: Take x^X with/(x)>>^. Suppose x^S,r, then since x=x-\-u=x+y^,edu 3 one has 7ro<7r('U) =ir(^ejtj) < ^Tfjtj i i from (3.14b); but for all x^S^, one has by construction ^_jXj^j<ir(,; hence x^S^r ' Q.E.D. THEOREM 4 : For every Ar-valid inequality l>;^^>ao(>0), J there exists a convex gauge t satisfying (3.14) and TT = 0^0 PROOF: Define Sa={x^X\jy,jtj<ao}; by hypothesis Sa contains no point x with /(a;)>A:. Setting iro=a<, and ir(ej) = aj (with ir(0) = 0) one defines a gauge tt, which is a hyperplane, and which satisfies (3.14) ; in general, there may exist a way to modify the above hyperplanar gauge into a convex gauge t, so that one may have the more general relation Trj<aj, -Vj^N Q.E.D. 
NON-CONVEX PROGRAMMING THEORY 61 REMARK : The above theorem indicates that every supporting hyperplane of the convex hull conv { X ^ X\j{x) > k ] may be generated by a subadditive gauge tt. Thus, in principle, the subadditive gauge approach can produce the most stringent /: -valid inequalities. 3.7. Example We now illustrate the construction of /:-valid inequalities as described in the Sections 3.4, 3.5, and 3.6. Let j{x, y) = {xy)-^ and X:4:X-\-y>22/3, x, y>0 x+y>S x—y<l (see Figure 2). 3.7.1. "Tuy" and "Cut 1," Choose x= (2, 1) and define the gauge tt as in the example of Section 3.4. One obtains the following coefficients: 7ri=7r(-l, l) = (X)-i = l 7r2=ir(l, 1)= max \--> — 1 =-- For the (A:=-j— valid inequalities, one has (7r<,= l): Tuy:<i>l,i.e. -^x+y>l Gauge tt: U—^ t2>l i.e. x—y>l ■K is the maximal convex extension with respect to S^ (see end of Section 3.4 and Section 4.2). 3.7,2. "Cut 2." The above cuts are based solely on levi/j/; the following gauge now takes the set -X" into consideration (3.12): I) in the halfspace - w+w>0: it lA) u<(i,Zu+v>0:Tr{u,v)^ ~^^ {u+2v) ir(0, 0)=0 IB) (tt<0 and Zu-\-v<0) or u>0: ir{u, v)^^^ (648w+y) 1U75 62 C. BURDET II) ioT-u-\-v<C,0:ir(u,v) = -\-co Thus one has 647 TTi — 1075 7r2= 1 2 3.7.3. "Cut 3." Finally we also show (cut 3) the inequality corresponding to the convex hull de- scribed in the final remark of section 3.6. 4. GAUGE EXTENSIONS ' ' In this section, we apply the general methodology developed in [2] to characterize some/ and X related gauge functions (see Section 3.5). These gauges are described as (maximal) convex extension with respect to the set X of the/ related gauge functions of Section 3.4. It is also shown that the latter are (maximal) convex extensions of the Tuy-type gauge func- tions given in Corollary 2.1. Thus the concept of (maximal) convex extensions of a convex function with respect to a given set appears here as a fundamental tool. 
We then derive a result indicating that the maximal convex gavge extension corresponds in fact to a convex gauge based on the maximal quasi-convex extension oj /; it therefore dominates all previously known results of this type, in particular those due to Tuy [6]. Finally we introduce the notion of polaroid gauge junctions (see also [3]). It is shown that, when the level set gauge is differentiate (-t^uj^O), the polaroid approach offers an analytical method to determine the maximal convex gauge extension. 4.1. Convex Gauge Extensions Consider the convex gauge x, defined on R^; and let S be a subset of 5", with non-empty interior (Int 5'?^0); Since tt is a convex gauge, the set epi ir is convex and it can be represented by the intersection of all halfspaces H which support epi tt: H{y)={{h, u)\h>HyU, ueR"} with h=Hyy—T(y) and ir{u) can now be represented by (4.1) Tr{u)=su-p {Hyu}. ytR" The maximal convex gauge extension U of tt with respect to S is constructed by relaxing (4.1) where only certain points y in the sup are selected : DEFINITION 3: (4.2) U{u)= sup {Hyu] ye Int S PROPERTIES: Pi) One has ir('u) >n(u), -Vu which implies dominance of the inequalities generated from TT by those stemming from 11. P2) n is convex NON-CONVEX PROGRAMMING THEORY 63 (4.3) P3) n is a gauge (4.4) Prooj: n(Xw)= sup {i/yXM)=X sup {iy,ui=xn(w), forX>0 Q.E.D. yt Int S yt Int S P4) levn. n = lev^, MQCir3lev,, tt, for Tio — iro>Q where MQCir denotes the maximal quasi-convex extension of tt with respect to the set S (see [2] for further details on M()C functions) . PROOF: We need show that the set lev,„n is the maximal convex extension (with respect to S) of the set lev^.ir. For each y^ Int S, one has by construction: U.{y) = Tr(y) and therefore also n(X?/) = Tr(X2/), -P^X>0. Now if TTo is such that lev,. IT f\ Int aS=0, we simply have 11 = 0, by definition. Otherwise consider the collection of halfspaces G{y)=\ueR-\H,u>Hy) = {H,y)}. One has /^'~\ lev,.n=cl f 1 G{y), !/«(Int Sn lev,. 
I for iro>0 which is (by definition) the maximal convex extension of the set lev,„ tt with respect to iS" (see [2], Theorem 2). Q.E.D. P5) By construction of the level set gauge tt (3.8), one has lev,. ir=[{\eY^ f)—x], where 1c'>j{x), XiS; thus one also has lev,. MQCTr=[(leVft MQCf)—x]; since both MQC functions are obtained from the maximal convex extension of the same sets; Thus from the above property P4, we observe that the convex gauge extension IT is the gauge belonging to MQCf, not/; it therefore brings about "better" optimality conditions for the program (2.1). REMARKS: There are two instances for which an extension process is called for in our approach : Rl) for the definition of the convex gauge n along rays where the function/ is unbounded {S=Rj) as explained at the end of Section 3.4. R2) for the construction of a gauge which takes the feasible region X into account S=X-x Thus one may naturally choose the following set S in Definition 3 : (4.5) S=[{xt-X\{x)>k}fx] in order to combine both aspects Rl and R2. 4.2. Polaroid Gauges We now use a generalized concept of polarization introduced in [1] to construct subadditive gauges. Consider a function ^(u; p) : R^XP^R and a set P; assume *(•; p) to be a convex gauge Vp^P and define the polaroid gauge (see [1, 3]) : ;4.6) n(u)= sup Hu;p) LEMMA 6: n(tt) is a convex gauge 64 C. BURDET PROOF (omitted) : See Lemma 7 and 8 of [3] COROLLARY 6.1.: The inequality is /:-valid with (4.7) Uo— min sup <I>(u; p) U€S peP EXAMPLE: Let ^(1*) be a level set gauge defined by (3.10) and assume v difFerentiable for all u^O with X (u)<C -j- CO . Define (4.8) ^(.u;p)=T{u) + ip-u)Vir(u) Then cnoose P=S<zR'^ to obtain for each u^S: n(w) =sup ^{u ; p) =<J>(-u; u) — tt (u) pep Thus n„=min7r («)=#(,, utS the same value as in Corollary 2.1. Note that Uiu) is in general different from t(u) for u^S. 
Furthermore, one can show (see [2], Lemma 12) that the gauge IT constructed here is, in fact, the maximal convex extension of ir with respect to S. Note that this extension is obtained from a polarization of the level set gauge v, not of the objective function/. Although one has, by definition, lev^, 7r=[(lev;t /) — x,] the polaroid gauges derive from tt and / are quite different in general. REMARKS: 1) If ir{u) is not differentiable -Vv.^^O, the gradient V7r(w) can be replaced by the subdifferential dir{u) in (4.8) since -k is convex: i.e. (4.9) ^{u;p) = t{u)-\- sup {p—u)^ir Vir«J)ir(U) 2) In principle, the determination of the coefficients n^, and n<, are obtained from an auxiliary program i.e. (4.10a) nj=n(ej) = sup$(e^;p) pel' j (4.10b) ^<,= minsup'l>/]S^;^^;2'^ uiS p(P \ j J with u='^ejtj j f NON-CONVEX PROGRAMMING THEORY 65 The nature and degree of difficulty of these programs (4.10) largely depend upon the set P ind the form of the function * Since there are n coefficients 11; one will try to keep (4.10a) as simple as possible. EXAMPLES. In the following two examples, the program (4.10a) reduces to the explicit determination of the largest number in a finite list: 1) P finite = {pu P2, • • •, Pp} .e. $(u;p) = $p(w),:P^P [ 2) P may also be a list of finite index sets P={A, A, . . .Pr] dth the following convex gauges ^rtT: ^r{u;p),P^Pr ' or instance PtPr nd the corresponding polaroid gauge becomes : n(t() = max y^, #„ iu) . TtT P(P, We may now analyse Example 3.7 and Figure 2. Starting from Tuy's cut (and the corresponding ivel set gauge which has Tr{u, v) = 0,-¥u, v>0 lid le cut 1 is obtained by polarization (4.8) with Si={{u, v)\u<0, or v<0}; this corresponds to mark Rl, the end comments of section 4.1, and to the end comments of section 3.4. 
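The polarization of the level set gauge can be explored numerically for the example with $\pi(u, v) = -uv/(u + 2v)$. The sketch below rests on two assumptions that are not spelled out in the text: the polarization is read as the tangent plane of $\pi$ at $p$, evaluated at $u$, which by Euler's identity for positively homogeneous functions reduces to $\nabla\pi(p) \cdot u$; and the supremum over $S_1$ is approximated by sampling directions $p$, which suffices because $\nabla\pi$ is constant along rays. The function names and the sampling parameters are illustrative.

```python
import math


def grad_pi(mu, nu):
    """Gradient of pi(u, v) = -u*v / (u + 2*v) at (mu, nu), for mu + 2*nu > 0."""
    d = (mu + 2.0 * nu) ** 2
    return (-2.0 * nu * nu / d, -mu * mu / d)


def polaroid_gauge(u, v, eps=1e-4, steps=2000):
    """Approximate sup over p in S_1 of grad_pi(p) . (u, v).

    S_1 = {mu < 0 or nu < 0}, intersected with the halfspace mu + 2*nu > 0,
    consists of two open arcs of directions, sampled here on a uniform grid.
    """
    lo = -math.atan(0.5)            # direction where mu + 2*nu = 0, nu < 0
    hi = math.pi - math.atan(0.5)   # direction where mu + 2*nu = 0, mu < 0
    arcs = [(lo + eps, -eps), (math.pi / 2.0 + eps, hi - eps)]
    best = -math.inf
    for a, b in arcs:
        for i in range(steps + 1):
            theta = a + (b - a) * i / steps
            g_u, g_v = grad_pi(math.cos(theta), math.sin(theta))
            best = max(best, g_u * u + g_v * v)
    return best
```

Under these assumptions, the value at a direction inside $S_1$, such as $(-1, 1)$, agrees with $\pi$ itself up to the sampling error, while in the quadrant $u, v > 0$ the supremum approaches $\max\{-u/2, -v\}$, the bound that appears in the example of Section 3.4.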
For the corresponding polaroid gauge one has: $\Pi(u, v) = +\infty$ in the halfspace $u + 2v < 0$; otherwise, in the halfspace $u + 2v > 0$:

$$\Pi(u, v) = \sup_{\mu < 0 \text{ or } \nu < 0} (\mu + 2\nu)^{-2}\left[-(2\nu^2 u + \mu^2 v)\right] = \begin{cases} -\dfrac{uv}{u + 2v} & \text{for } u \le 0 \text{ or } v \le 0 \\[2ex] \max\{-\tfrac{1}{2}u,\ -v\} & \text{for } u, v \ge 0 \end{cases}$$

Similarly for the cut 2, with

BIBLIOGRAPHY

[1] Burdet, C. A., "Polaroids: A New Tool in Non-Convex and in Integer Programming," Naval Research Logistics Quarterly, 20, 13-24 (1973).
[2] Burdet, C. A., "Convex and Polaroid Extensions," WP 73-21, Faculty of Management Sciences, University of Ottawa (1973).
[3] Burdet, C. A., "On the Algebra and Geometry of Cuts," WP 74-8, Faculty of Management Sciences, University of Ottawa (1974).
[4] Burdet, C. A., "On Polaroid Intersections," Mathematical Programming in Theory and Practice, P. Hammer and G. Zoutendijk, eds., pp. 365-387 (North Holland, 1974).
[5] Burdet, C. A., "The Facial Decomposition Method," Operations Research Quarterly, 24, 459-463 (1973).
[6] Tuy, Hoang, "Concave Programming Under Linear Constraints," (Russian) Doklady Akademii Nauk SSSR, 1964. English translation in Soviet Mathematics, 1437-1440 (1964).
[7] Johnson, E. L., "A Group Problem for Mixed Integer Programs," to appear in Mathematical Programming (1975).
[8] Rockafellar, R. T., "Convex Analysis," (Princeton University Press, 1970).

CONVEX AND POLAROID EXTENSIONS

Claude-Alain Burdet
SYSTEMATHICA Consulting Group Ltd.
Pittsburgh, Pennsylvania

ABSTRACT

In an effort towards a comprehensive and unified theory, this note presents some new results in the area of non-convex programming within the framework of convex (sets and function) analysis. The entire study is primarily devoted to the development of useful tools for extreme point programs (such as concave or integer programs).

0. BACKGROUND

This paper presents several new ideas in convex analysis to come to grips with non-convex programming problems.
It lies in the mainstream of developments in the theory of polar sets [7], generalized polars [1] and polaroids [2]; although some results rely heavily on concepts introduced in [2] and [3], the paper is self-contained.

The line of research pursued here is related to the following general constrained optimization problem. Consider the non-linear program

maximize $f(x)$, subject to $x \in X \subset \mathbb{R}^n$.

One has immediately the following paraphrase of the global optimality property, which may be interpreted as a necessary and sufficient optimality condition:

[$\bar{x} \in X$ is globally optimal] $\Leftrightarrow$ [$\mathrm{lev}_k f \supset X$, with $k = f(\bar{x})$].

The object of this note is to establish weaker sufficiency conditions for the case where both $\mathrm{lev}_k f$ and $X$ are convex.

In his short note [5], Tuy proposed some interesting ideas which can be applied to this kind of problem, particularly when $X$ is polyhedral. This paper is an attempt to produce more powerful results (both theoretically and practically) and to provide some elements of mathematical structure in this area.

In a first section we introduce the concept of convex extension of a set with respect to another. It includes a pointwise definition, a study of the question of maximality, and an equivalent definition by means of hyperplanes.

This concept is applied to define convex and quasi-convex extensions of a given convex function $f$ with respect to a (convex) set (Sections 2 and 3); we also construct a maximal convex extension of $f$, which is shown to have the same properties as the extension described geometrically by Tuy [5]; furthermore, a dominance theorem is established.

Section 4 briefly reviews an application of these extensions to the non-convex programming problem area.

The next section features another type of extension called polaroid extension, which possesses the advantage (over the previous two types of extension) of being more computationally tractable.
Section 6 handles the special case where $f$ is assumed differentiable; an analytical definition of the maximal convex extension of $f$ is presented; it is also imbedded in a family of polaroid extensions. The end of this section is devoted to a brief description of the quadratic case [1], where the general results derived in the previous sections assume a special form; parallel studies [1, 4 and 8] develop the corresponding algorithmic implementations.

We then examine in greater depth the application of the present results to non-convex programs; it is shown how a valid cut can be constructed from an extension. In the second part of Section 7 a new type of cut based on a cutting polaroid is established: it combines cutting planes and polaroids to produce a "deep" cut where the two theories interact.

Finally a last section indicates some further improvements which can be obtained when the feasible set is polyhedral.

To conclude this introductory tour it should be pointed out that little effort is made here to investigate every aspect of possible interest in the theory of polaroids. We have tried instead to adopt a line of development which seems most promising as far as implementation of new ideas is concerned. This has been done with the double intent of obtaining early results of tangible interest and of attracting others to the development of an open area.

1. SET DEFINITION AND PROPERTIES

Consider a closed convex set $C$ in $\mathbb{R}^n$ and a convex set $D$ with non-empty interior (i.e., $\mathbb{R}^n = \mathrm{aff}\, D$).

DEFINITION 1: A convex extension $E$ of $C$ with respect to $D$ is defined to satisfy:
a) $E$ is convex
b) $E \supset C$
c) $(E \cap \mathrm{Int}\, D) \subset (C \cap \mathrm{Int}\, D)$

Note: b) and c) imply $(E \cap \mathrm{Int}\, D) = (C \cap \mathrm{Int}\, D)$.

LEMMA 1: $E$ is closed.

PROOF: We show that the complement of $E$ is open; take $x_1 \notin E$, then there exists $x_2 \in E$ such that $x_3 = \lambda x_1 + (1-\lambda)x_2$, $\lambda \in (0, 1]$, satisfies $x_3 \in \mathrm{Int}\, D$ but $x_3 \notin C$. $C$ is closed, hence there exists an open ball $B(x_3)$ such that $B(x_3) \subset D$ but $B(x_3) \cap C = \emptyset$.
If* CONVEX AND POLAROID EXTENSIONS 69 A Since Xy^O, we may define the following open neighborhood of Xj U{x,)={x=\-'[y-(l-X)x2]\yeB(x3)}; and ^x^U{xi) there exists y^ Int /> with y^Chence x^E, i.e., U{xi)r[E=0. Q.E.D. The following example shows that E need not be unique EXAMPLE 1: DEFINITION 2 : A convex extension E is said maximal if EzdE' for any convex extension E' y{ C with respect to D). LEMMA 2 : A convex extension E is maximal iff it contains all points Xi such that 4^X2 ^ C, one asxs^ Int Z7=^X3^ (7, with X3=Xxi + (l — X)x2, X^[0, 1]. PROOF: From definition 1, one sees that every point Xi of a convex extension satisfies the )ove; if a convex extension E contains all such points, it contains every extension and is therefore, aximal. Conversely, we show that there always exists a convex extension which contains Xi; it then llows from the maximality of E that Xi ^ E. Construct the set V{x{)={x\x=\x+{\-\)y, \^[Q,l],y^C]. early V is convex; from the hypothesis one also has y(xi)n intz)=(7n imD. q.e.d. Example 1 shows that the set of all points Xi (Lemma 2) need not be convex (and therefore, t an extension E according to definition 1). THEOREM 1 (existence and uniqueness) : If {Cr\ Int D) ?^0, there exists exactly one maximal bension E. PROOF (existence) : The set Q of points Xi satisfying the conditions of Lemma 2 satisfies b) d c) of Definition 1 ; it remains to show convexity. I Let x^3 = Xx,»+(l-X)Xl^ X^ [0, 1] where x/, Xi^ ^Q. First we note that if the (convex) segment Xi'xi^( = S) intersects C then x,^^Q because Xi' 70 C. BURDET selves to the case where the segment S does not intersect C. Construct the set V{xi^) and V{xi^) (seeLemma2);define J^6c^0with J=Ma;i^+(l-M)2/, m€(0, 1) for anyi/^Cn IntZ);itis sufficient to show y ^ Int D, implying xi^ ^ Q. One has y^ cl [V{x,')-C]n el [V(x,')-C], because SnO=&; but [V{x,*)-C]r[ Int Z)=-0, hence also cl [V(xi')-C] fl Int D=0 and therefore, ^^ Int D. 
(Uniqueness) : For any two maximal extensions E, E' one must have EczE' and E'cE' hence E=E' . Q.E.D. Theorem 1 provides a pointwise definition of the maximal convex extension of C with respect to D. We now turn to an equivalent definition, by means of halfspaces. THEOREM 2: The maximal convex extension E oi C with respect to D is the closed intersec- tion 4^y G (CTl Int Z>) ?^0 of all halfspaces H{y) = [x\hyX<hQ] corresponding to hyperplanes {hyX=ho which support C at y (i.e., ko = h^y and hyX<ho, -Vx^ C) : E=C\ n H{y) ygCn IntD PROOF: E clearly contains C and is convex; furthermore, {E{] IntZ)) = (Cn IntZ)). Hence E\^ & convex extension. We next show that E contains every point Xi (see Lemma 2) ; take any Xi, i.e., 4^X2 ^ Cone has X3 ^ Int D=>X3 ^ C. Suppose there exists a supporting hyperplane at x'2 ^ CD Int D separating x, from C; then there exists no x'3= Xxi+ (1 — X)x'2, X ^ [0, 1] with x'3 ^ C; however, there is an op8n ball B{x'2)cD so that x'3^5(x'2); this contradicts the property of Xi stipulated in Lemma 2. Thus, there is no such separating support and Xi^E. Maximality of the set of points Xi (Lemma 2) now implies the reverse inclusion; hence both definitions characterize the same set E. Q.E.D. Note that we have to define E as the closure of this intersection of halfspaces because, in general, this intersection may not be closed as shown in Example 2. EXAMPLE 2 CONVEX AND POLAROID EXTENSIONS 71 It may also be useful to remark that this concept of extension of a set with respect to another can be generalized to non-convex sets. 2. CONVEX EXTENSIONS Consider now the real-valued function /; let X be a closed convex subset of R" and assume Xci dom/ and Int X9^9. LEMMA 3: Consider a convex set Ac(IR"X R) and define j{x)= inf (r, +oo) (I, r)fA then / is convex. PROOF: Take x=Xa;i + (l-X)x2. We show/(x)<X/(a;,) + (l-X)/(x2), X^[0, 1]. 
By convexity of $A$ one has:

$$f(x) = \inf_{(x, r) \in A} r \le \lambda r_1 + (1-\lambda) r_2$$

for any $r_1, r_2$ such that $(x_1, r_1)$ and $(x_2, r_2) \in A$; hence, also for $f(x_1) = \inf r_1$ and $f(x_2) = \inf r_2$. Q.E.D.

The converse of Lemma 3 is, of course, also true, i.e., the epigraph $A = \mathrm{epi}\, f = \{(x, r) \mid x \in X,\ r \in \mathbb{R},\ r \ge f(x)\}$ is convex.

Define the (convex) cylinder $D$ over the convex set $X$:

$$D = \{(x, b) \mid x \in X,\ b \in \mathbb{R}\} \subset \mathbb{R}^{n+1}$$

DEFINITION 3: A convex extension $CF$ of the function $f$ with respect to the (closed convex) set $X$ is a function whose epigraph $G$ is the convex extension of the set $\mathrm{epi}\, f$ with respect to $D$, i.e.,

$$CF(x) = \min\{+\infty,\ \varphi \mid (x, \varphi) \in G\}.$$

In mathematical programming, we are really only interested in the values of $f$ over the feasible set $X$; the next Lemma shows that $f$ and $CF$ are "essentially" identical on $X$, i.e., that one may replace $f$ by its extension $CF$.

LEMMA 4: $\forall x \in \mathrm{Int}\, X$, one has $CF(x) = f(x)$.

PROOF: For $x \in X$ one has $(x, \varphi) \in D$, $\forall \varphi \in \mathbb{R}$; furthermore, from Definition 1, $(\mathrm{Int}\, D \cap \mathrm{epi}\, f) = (\mathrm{Int}\, D \cap \mathrm{epi}\, CF)$ holds true. Hence, $\forall x \in \mathrm{Int}\, X$,

$$CF(x) = \min\{+\infty,\ \varphi \mid (x, \varphi) \in \mathrm{epi}\, CF\} = \min\{+\infty,\ \varphi \mid (x, \varphi) \in \mathrm{epi}\, f\} = f(x).$$

Q.E.D.

We now examine the discrepancies between $f$ and $CF$, i.e., the special cases where the two are not rigorously identical.

LEMMA 5: The convex extension of a convex function on $X$ is a continuous convex function.

PROOF: Convexity follows from convexity of $\mathrm{epi}\, CF$ and Lemma 3. Furthermore, the set $\mathrm{epi}\, CF$ is closed (Lemma 1), implying continuity of the min function. Q.E.D.

Note that Lemma 5 does not require continuity of $f$; and, if $f$ is discontinuous, one may have $f(x) \ne CF(x)$ for some $x \in \mathrm{bd}\, X$.

COROLLARY L5.1: If $f$ is continuous on $X$ then $f(x) = CF(x)$, $\forall x \in X$.

PROOF: Take $x \in \mathrm{bd}\, X$; continuity of $CF$ and $f$, with $f(x) = CF(x)$, $\forall x \in \mathrm{Int}\, X$, implies equality on the boundary also. Q.E.D.

Maximality was introduced in Section 1 because, as we shall see below, it introduces in a natural way the most "desirable" extensions, i.e., those which dominate the others and yield better sufficient conditions. If one considers the maximal convex extension in Definition 3, then the resulting function $MCF$ is called the maximal convex extension of $f$; and one has:

THEOREM 3: For any real-valued convex function $f'$ with $f'(x) = f(x)$, $\forall x \in X$, one has $f'(x) \ge MCF(x)$, $\forall x \in \mathbb{R}^n$.

PROOF: Maximality of $\mathrm{epi}\, MCF$ with respect to $D$ implies $\mathrm{epi}\, f' \subset \mathrm{epi}\, MCF$. Q.E.D.

Note that $\forall x \in \mathbb{R}^n$ one has $MCF(x) \le f(x)$, so that $MCF$ can only become $+\infty$ when and where $f$ has the same property.

In the special case where $f$ is not continuous on $X$ (discontinuities may only occur on $\mathrm{bd}\, X$), the epigraph should be replaced by its closure: $\mathrm{cl}\, \mathrm{epi}\, f$.

The following immediate corollary relates the present construction to the concept envisaged by Tuy in [5]:

COROLLARY T3.1: A convex function $g(x)$ satisfies (i), (ii), (iii) below iff $g(x) = MCF(x)$.
(i) $g(x) = f(x)$, $\forall x \in X$
(ii) $g(x) \le f(x)$
(iii) (maximality): $g(x) \le f'(x)$ for any convex function $f'$ satisfying (i), (ii).

PROOF: From Theorem 3 we know that $MCF$ satisfies the above conditions (i)-(iii); in particular $MCF(x) \le g(x)$; furthermore, $MCF$ is convex, hence from (iii) one must have $g(x) \le MCF(x)$. Q.E.D.

Thus, one has: The maximal convex extension of the function $f$ with respect to $X$ is unique.

3. QUASI-CONVEX EXTENSIONS

In many ways, and particularly in the context of the optimality condition described in Section 0, convex extensions appear to be too restrictive; indeed only quasi-convexity is really required, and we now turn to the definition of the corresponding generalizations.

DEFINITION 4: $f$ is a quasi-convex function if and only if its level sets $\mathrm{lev}_\varphi f = \{x \mid f(x) \le \varphi\}$ are convex.

DEFINITION 5: A quasi-convex extension $QCF$ of the function $f$ with respect to the (convex) set $X$ is a function whose level set $L_\varphi$ is a convex extension of $\mathrm{lev}_\varphi f$ with respect to $X$; by definition one has

$$QCF(x) = \inf\{\varphi,\ +\infty \mid x \in L_\varphi\}.$$

It may happen here that $\mathrm{lev}_\varphi f$ is not closed (when $f$ is discontinuous); one then uses $\mathrm{cl}\, \mathrm{lev}_\varphi f$ to define the extension.

LEMMA 6: $\forall x \in \mathrm{Int}\, X$, one has $QCF(x) = f(x)$.
PROOF: By Definition 1, one has $\mathrm{Int}\, X \cap \mathrm{lev}_\varphi f = \mathrm{Int}\, X \cap L_\varphi$, $\forall \varphi \in f(X)$; thus, for $x \in \mathrm{Int}\, X$ one has $f(x) = \varphi \Leftrightarrow QCF(x) = \varphi$. Q.E.D.

LEMMA 7: The quasi-convex extension of a quasi-convex function on $X$ is a continuous quasi-convex function.

PROOF: As for Lemma 5, this follows from convexity and closedness of the convex extension $L_\varphi$. Q.E.D.

Here again $f$ is not required to be continuous and one may have $f(x) \ne QCF(x)$ for $x \in \mathrm{bd}\, X$ (see remark after Lemma 5).

If $L_\varphi$ is the maximal extension $\forall \varphi$, one obtains the maximal quasi-convex extension $MQCF$ with the following property:

THEOREM 4: For any real-valued quasi-convex function $f'$ with $f'(x) = f(x)$, $\forall x \in X$, one has $f'(x) \ge MQCF(x)$, $\forall x \in \mathbb{R}^n$.

PROOF: By construction $\mathrm{lev}_\varphi MQCF$ is maximal $\forall \varphi$; hence $\mathrm{lev}_\varphi f' \subset \mathrm{lev}_\varphi MQCF$ must hold $\forall \varphi$. Q.E.D.

Clearly, since a convex extension is also quasi-convex, one may consider both extensions of a convex function; one then has the following uniform dominance theorem.

COROLLARY T4.1: $\mathrm{lev}_\varphi f \subset \mathrm{lev}_\varphi MCF \subset \mathrm{lev}_\varphi MQCF$, $\forall \varphi$, i.e., $f(x) \ge MCF(x) \ge MQCF(x)$, and equality holds throughout on $\mathrm{Int}\, X$.

An application of this main result is presented in the next section.

In conclusion of this first part of the paper, we observe that quasi-convex extensions provide the proper theoretical foundations (rather than convex extensions) for the investigation of optimality conditions within the framework of convex analysis.

It should also be pointed out that, in spite of its trivial derivation, the above dominance theorem is of great practical importance; the reader may also convince himself that the quasi-convex extension $MQCF$ has a graph different from $MCF$.

4. (QUASI) CONCAVE PROGRAMMING

Let us consider our initial non-linear program (see Section 0): maximize $f(x)$ subject to $x \in X \subset \mathbb{R}^n$, where $X$ is assumed (closed) convex and $f$ (quasi-)convex.
Thus, the inclusion test $\mathrm{lev}_k f \supset X$ required by the sufficiency conditions deals with two convex sets and we can use the Uniform Dominance Theorem T4.1 to obtain better optimality conditions than the above. We use the word "better" to indicate that, for whatever method used to actually perform an inclusion test, it is reasonably clear that such a test is more conveniently performed on $Q_1 \supset X$ rather than on $Q_2 \supset X$ when $Q_1 \supset Q_2$.

The following theorem shows that sufficiency is indeed maintained.

THEOREM 5: Assume continuity of $f$ on $X$; then one has

$$X \subset \mathrm{lev}_k f \Leftrightarrow X \subset \mathrm{lev}_k MCF \Leftrightarrow X \subset \mathrm{lev}_k MQCF$$

and $f(x) = MCF(x) = MQCF(x)$, $\forall x \in X$.

PROOF: Immediate from Lemmas 5 and 7.

REMARK 1: The possibility of improving the sufficiency inclusion test by considering a convex extension was first pointed out by Hoang Tuy in [5]; the equivalence of his description of a convex extension with ours is shown in Theorem 3. The introduction of a quasi-convex extension serves the double purpose of providing yet a better condition when $f$ is convex and also a possibility to handle the quasi-convex case in a similar manner.

REMARK 2: Since the condition $\mathrm{lev}_k f \supset X$ is necessary and sufficient for optimality of $\bar{x} \in X$ (with $f(\bar{x}) = k$), Theorem 5 describes a variety of necessary and sufficient optimality conditions; it should be noted however that while extensions have the property of improving (weakening) the sufficiency conditions, this has an adverse effect on the necessary aspect, since one wants the strongest possible necessary condition. On the other hand, our assumptions on $X$ and $f$ render this "necessary" side of the question uninteresting because it has been thoroughly developed in the literature.

5. POLAROID EXTENSIONS

The definitions of (quasi-)convex extensions do not lend themselves easily to a straightforward algorithmic implementation because they are geometric rather than analytical. One may also encounter practical difficulties when $f$ is not continuous on $X$.
We now introduce the concept of polaroid extensions to remedy this situation.

DEFINITION 6 [2]: Given $k \in \mathbb{R}$, the polaroid set $X^*(k)$ corresponding to the set $X$ with respect to the function $\varphi: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is defined by

$$X^*(k) = \{y \mid \varphi(x; y) \le k,\ \forall x \in X\} \subset \mathbb{R}^n.$$

DEFINITION 7: The function $\varphi$ is said to polarize $f$ on $X$ if $\forall x \in X$ one has $f(x) = \varphi(x; x)$. The polarization is said to be proper when $\varphi$ satisfies $\varphi(x; y) \le \max\{f(x), f(y)\}$.

THEOREM 6: If $\forall x \in X$, the function $\psi(y) = \varphi(x; y)$ is quasi-convex (in $y$), then the polaroid $X^*(k)$ is convex $\forall k$.

PROOF: Omitted (see [2, 3]).

In the sequel, we shall always assume that $\varphi$ satisfies the hypothesis of Theorem 6, as we only make use of convex polaroids.

We can now apply the above concepts to our present objective of constructing a (quasi-)convex extension of $f$; to this effect we now consider the set $(X \cap \mathrm{lev}_k f)$ and its polaroid $P(k)$:

LEMMA 8: Under proper polarization, the polaroid $P(k) = (X \cap \mathrm{lev}_k f)^*(k)$ is a convex extension of the set $\mathrm{lev}_k f$ with respect to $X$.

PROOF:
a) Convexity follows from Theorem 6.
b) $P(k) \supset \mathrm{lev}_k f$: one has $P(k) = \{y \mid \varphi(x; y) \le k,\ \forall x \in (X \cap \mathrm{lev}_k f)\}$; take $y \in \mathrm{lev}_k f$; then $\forall x \in (X \cap \mathrm{lev}_k f)$, $\max\{f(x), f(y)\} \le k$ holds true. Hence, since the polarization $\varphi$ is proper, $\varphi(x; y) \le \max\{f(x), f(y)\} \le k$, i.e., $y \in P(k)$.
c) Take $z \in X \cap \mathrm{lev}_k f$, with $f(z) = k$; the set $Q = \{y \mid \varphi(z; y) \le k\}$ is convex since $\varphi(z; y)$ is assumed quasi-convex in $y$; furthermore, the following inclusions hold: $Q \supset P(k) \supset \mathrm{lev}_k f \supset (X \cap \mathrm{lev}_k f)$; moreover, $f(z) = \varphi(z; z) = k$ implies $z \in \mathrm{bd}\, Q$ and $z \in \mathrm{bd}\, \mathrm{lev}_k f$. Since $Q \supset \mathrm{lev}_k f$ and both sets are convex, there must exist a hyperplane through $z$ supporting both $Q$ and $\mathrm{lev}_k f$; furthermore one also has $z \in \mathrm{bd}\, P(k)$. Thus, since $P(k) \subset Q$, the set $P(k)$ has a common boundary with the set $\mathrm{lev}_k f$ within the set $X$. Convexity implies $P(k) \cap X = \mathrm{lev}_k f \cap X$. Q.E.D.

COROLLARY L8.1: Consider the maximal extension $L_k$ of $\mathrm{lev}_k f$ with respect to $X$; then one has $\mathrm{lev}_k f \subset P(k) \subset L_k$ and $(P(k) \cap \mathrm{Int}\, X) = (L_k \cap \mathrm{Int}\, X) = (\mathrm{lev}_k f \cap \mathrm{Int}\, X)$.
DEFINITION 8: A polaroid extension $PF$ of the function $f$ with respect to the set $X$ is a function whose level sets are $\mathrm{lev}_k PF = P(k)$, i.e., $PF(y) = \inf\{+\infty, k \mid y \in P(k)\}$.

Note that $PF$ depends (implicitly) on the polarization $\varphi$ of $f$ and that there are, in general, several polaroid extensions of the same $f$. One may also observe that the definition of $PF$ does not require $f$ to be convex; $PF$ is always quasi-convex, however. Furthermore one has the following alternate definition of $PF(y)$ for $y \in \mathrm{dom}\, PF$ (i.e., $PF(y) < +\infty$):
$PF(y) = \inf\{k \mid \varphi(x; y) \le k, \ \forall x \in X \cap \mathrm{lev}_k f\} = \sup\{k \mid \varphi(x; y) \ge k, \ \text{some } x \in X\}$;
hence $PF(y) = \max_{x \in X} \varphi(x; y)$, and the above expression can be used to compute any value of $PF$.

COROLLARY L8.2: $f(x) \ge PF(x) \ge MQCF(x)$, $\forall x$, and equality holds throughout $\forall x \in \mathrm{Int}\, X$.

PROOF: Follows from Corollary L8.1.

There exists no general dominance theorem between $MCF$ and $PF$; however, one obviously has

COROLLARY L8.3: If the polaroid extension $PF$ is convex then $PF(x) \ge MCF(x)$ (and $PF(x) = MCF(x) = f(x)$, $\forall x \in \mathrm{Int}\, X$).

PROOF: By maximality of $MCF$.

In conclusion, we note that polaroid extensions may constitute a viable alternative to the definitions of Sections 1 and 2, particularly when the non-linear programs
maximize $\varphi(x; y)$, subject to $x \in X$
are conveniently solved for each parameter $y$. Since it is necessary to produce the global optimum of the above program, one will naturally restrict the choice of $\varphi$ to functions of the following type: $\forall y$, $\varphi(x; y)$ pseudo-concave (in $x$); remember that we also assume, $\forall x$, $\varphi(x; y)$ quasi-convex (in $y$), in order to obtain convex polaroids.

LEMMA 9: Let $\varphi$ satisfy the above conditions, and let $X$ be a convex set; then the program $\max_{x \in X} \varphi(x; y)$ is convex $\forall y$, i.e., every local optimum is a global optimum.

PROOF: Omitted.
It seems natural, at this point, to question whether the abstract definition of a polaroid extension is really useful, and in particular whether there are cases where a concrete polarization $\varphi$ can be explicitly given; the following sections show that one can indeed easily find polarizations of a given function $f$ which are both strong (in the sense of the Dominance Theorem T4.1) and implementable.

6. DIFFERENTIABLE f

In this section we present some results related to Tuy's note [5]. First we give, by means of a polaroid extension, an analytical characterization of Tuy's convex extension; a family of polaroid extensions is then constructed which satisfies a dominance property (Theorem 8). The practical significance of this theory is briefly illustrated with a quadratic example, while a more comprehensive description of this special case can be found in [1] for the concave case and in [4] for the general case.

Define the polarization $\varphi(x; y) = f(x) + (y - x)\nabla f(x)$. Clearly $\varphi(x; x) = f(x)$ and, since $\varphi$ is linear in $y$, the hypothesis of Theorem 6 holds; moreover,

LEMMA 10: If $f$ is quasi-convex, $\varphi$ is proper.

PROOF: Quasi-convexity of $f$ implies, in the case of differentiability: $\max\{f(x), f(y)\} \ge f(x) + (y - x)\nabla f(x)$; hence $\varphi(x; y) \le \max\{f(x), f(y)\}$. Q.E.D.

We now restrict ourselves to a convex function $f$ and show that the corresponding polaroid extension $PF$ is the maximal convex extension $MCF$ (Lemmas 11 and 12).

LEMMA 11: $PF$ is convex and $PF(y) \le f(y)$ when $f$ is pseudo-convex.

PROOF: We only need consider $y \in \mathrm{dom}\, PF$; take $y = \lambda y^1 + (1 - \lambda) y^2$ and let $\bar{x} \in X$ attain the maximum of $\varphi(x; y)$ over $X$. Then
$PF(y) = \max_{x \in X} \varphi(x; y) = \max_{x \in X} [\lambda \varphi(x; y^1) + (1 - \lambda)\varphi(x; y^2)] = f(\bar{x}) + [\lambda y^1 + (1 - \lambda) y^2 - \bar{x}]\nabla f(\bar{x})$
$= \lambda[f(\bar{x}) + (y^1 - \bar{x})\nabla f(\bar{x})] + (1 - \lambda)[f(\bar{x}) + (y^2 - \bar{x})\nabla f(\bar{x})] \le \lambda \max_{x \in X} \varphi(x; y^1) + (1 - \lambda) \max_{x \in X} \varphi(x; y^2) = \lambda PF(y^1) + (1 - \lambda) PF(y^2)$.
Also $PF(y) = \max_{x \in X} \varphi(x; y) = f(\bar{x}) + (y - \bar{x})\nabla f(\bar{x}) \le f(y)$ by pseudo-convexity of $f$.

LEMMA 12: $PF$ is maximal.

PROOF: We must show that for any convex function $g(x)$ with
$g(x) = PF(x) = f(x)$, $\forall x \in X$, one has $g(y) \ge PF(y)$, $\forall y \in R^n$. Since $g$ and $f$ agree identically on $X$, $g$ is subdifferentiable; furthermore one has $\nabla f(x) \in \partial g(x)$, $\forall x \in X$, where $\partial g(x)$ is the subdifferential of $g$ at $x$. Now the subgradient inequality gives $g(y) \ge g(x) + (y - x)u$, $\forall u \in \partial g(x)$, $\forall y \in R^n$; in particular for $u = \nabla f(x)$. Now given $y \in R^n$, choose $\bar{x} \in X$ such that
$PF(y) = \max_{x \in X} \varphi(x; y) = f(\bar{x}) + (y - \bar{x})\nabla f(\bar{x})$;
one has $g(y) \ge g(\bar{x}) + (y - \bar{x})\nabla f(\bar{x}) = f(\bar{x}) + (y - \bar{x})\nabla f(\bar{x}) = PF(y)$. Q.E.D.

It is interesting to note the following additional property of $PF$: Take $y \in R^n - X$; assume $x' \in X$ is such that
$\varphi(x'; y) = f(x') + (y - x')\nabla f(x') = \max_{x \in X} \varphi(x; y) = PF(y)$.
Then $PF(z) = f(x') + \lambda (y - x')\nabla f(x')$, $\forall z = \lambda y + (1 - \lambda)x'$, $\lambda \in [0, 1]$, because $PF$ is convex and satisfies by construction $PF(z) \ge \varphi(x'; z)$ (since $x' \in X$) with equality holding at $z = x'$ and $z = y$; i.e., the convex extension $PF$ is linear on the segment $[x', y]$.

COROLLARY L12.1: $PF = MCF$.

Note that convexity of $f$ is not essential here; however, if $f$ were not convex one could have $PF(x) \ne f(x)$ for some $x \in X$, since $PF$ is convex. It should also be noted that $PF$ is only maximal with respect to other convex extensions; if quasi-convex extensions are taken into consideration, maximality is, in general, lost, as shown by Theorem 8 below.

The polaroid extension $PF$ can be extended to the case where the function $f$ is only required to be convex and continuous on $X$ (i.e., only subdifferentiable instead of differentiable); this merely involves replacing the gradient by a subdifferential.

It is interesting to note that although $PF$ is the maximal convex extension of a convex (differentiable) function $f$, it is really not the best extension which can be obtained by this type of polarization; to illustrate this point, we now imbed $PF$ in a family of polaroid extensions $PF_\alpha$ (with $PF_1 = PF$)
$\varphi_\alpha(x; y) = f(x) + \alpha (y - x)\nabla f(x)$, $\alpha > 0$,
and prove the following dominance theorem.

THEOREM 8: $PF_{\alpha_1}(x) \ge PF_{\alpha_2}(x)$ if $\alpha_1 > \alpha_2$.
PROOF: We show that the corresponding polaroids satisfy $P_{\alpha_1}(k) \subset P_{\alpha_2}(k)$. Take $y_1 \in \mathrm{bd}\, P_{\alpha_1}(k)$, and choose $x_1 \in X \cap \mathrm{lev}_k f$ such that $\varphi_{\alpha_1}(x_1; y_1) = k$. Suppose $y_1 \notin P_{\alpha_2}(k)$; then there must exist a point $x_2 \in X \cap \mathrm{lev}_k f$ such that $\varphi_{\alpha_2}(x_2; y_1) = k' > k$. It follows that $\alpha_2 (y_1 - x_2)\nabla f(x_2) = k' - f(x_2) > 0$ (since $f(x_2) \le k$) and hence $(y_1 - x_2)\nabla f(x_2) > 0$. Furthermore, $y_1 \in P_{\alpha_1}(k)$ implies
$f(x_2) + \alpha_1 (y_1 - x_2)\nabla f(x_2) = f(x_2) + \alpha_2 (y_1 - x_2)\nabla f(x_2) + (\alpha_1 - \alpha_2)(y_1 - x_2)\nabla f(x_2) \le k$;
hence $(\alpha_1 - \alpha_2)(y_1 - x_2)\nabla f(x_2) \le k - k' < 0$, which implies $(\alpha_1 - \alpha_2) < 0$, contradicting the hypothesis $\alpha_1 > \alpha_2$; thus, $y_1 \in P_{\alpha_2}(k)$. Q.E.D.

An Illustration: The Quadratic Case [1]

Consider $f(x) = cx + \frac{1}{2} xCx$, where $f$ is convex, i.e., $C$ is positive semi-definite. The maximal convex extension of $f$ with respect to a feasible set $X$ is defined by the polarization (see L12.1)
$\varphi(x; y) = cx + \frac{1}{2} xCx + cy + yCx - cx - xCx = -f(x) + [cx + cy + yCx]$.

COROLLARY L12.2: The maximal convex extension of a quadratic (convex) function can be obtained by solving convex programs of the type $\max_{x \in X} \varphi(x; y)$, where $y$ is given and $\varphi(x; y)$ is a concave function (of $x$).

PROOF: Since $f$ is convex, $-f$ is concave and $\varphi = -f +$ linear terms is also concave; hence, the program is convex. Q.E.D.

But the family of polarizations $\varphi_\alpha(x; y) = f(x) + \alpha (y - x)\nabla f(x)$, $(\alpha > 0)$, obeys the dominance relation of Theorem 8. Moreover, for the quadratic (convex) case, observe that $\varphi_{1/2}(x; y) = \frac{1}{2}[cx + cy + yCx]$ is bilinear. Thus, in this case the convex programs of Corollary L12.2 are linear programs.
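As a small numerical check of the dominance relation in the quadratic case, the following sketch evaluates $PF_\alpha(y) = \max_{x \in X} \varphi_\alpha(x; y)$ for a one-dimensional instance. The data ($c = 0$, $C = 2$, so that $f(x) = x^2$, and $X = [-1, 1]$ sampled on a grid) and all function names are illustrative assumptions, not taken from the paper; the grid maximum only approximates the true maximum in general.

```python
# Illustrative 1-D instance (assumed data): f(x) = c*x + 0.5*C*x*x with c = 0, C = 2.
C_LIN, C_QUAD = 0.0, 2.0


def f(x):
    return C_LIN * x + 0.5 * C_QUAD * x * x


def grad_f(x):
    return C_LIN + C_QUAD * x


def phi(x, y, alpha):
    """Polarization phi_alpha(x; y) = f(x) + alpha * (y - x) * grad f(x)."""
    return f(x) + alpha * (y - x) * grad_f(x)


def pf(y, alpha, grid):
    """Grid approximation of PF_alpha(y) = max over x in X of phi_alpha(x; y)."""
    return max(phi(x, y, alpha) for x in grid)


# X = [-1, 1], sampled at 201 equally spaced points.
GRID = [-1.0 + i / 100.0 for i in range(201)]

if __name__ == "__main__":
    y = 2.0  # a point outside X
    print(pf(y, 1.0, GRID), pf(y, 0.5, GRID), f(y))
```

For $y = 2$ the maxima are attained at the endpoint $x = 1$, giving $PF_1(2) = 3$ and $PF_{1/2}(2) = 2$, both below $f(2) = 4$, which agrees with Theorem 8 for $\alpha_1 = 1 > \alpha_2 = 1/2$.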
Since one has $\varphi_\alpha(x; y) = -(2\alpha - 1) f(x) + \alpha[cx + cy + yCx]$, it is apparent that for any given $y$ the function $\varphi_\alpha(x; y)$ is convex in $x$ when $0 < \alpha < 1/2$; thus, a polaroid extension of this type "better" than $PF_{1/2}$ (in the sense of Theorem 8) can only be obtained at the cost of solving a (non-convex) concave quadratic program in Corollary L12.2; this is of course of little practical value, since such programs are often of the same degree of difficulty as the original concave program: $\max f(x)$, subject to $x \in X$. However, in certain cases where the polyhedral set used to define the polaroid has a simple structure (for instance, the simplicial set $S$ of Section 7), this can be efficient (see [4]).

7. POLAROIDS AND CUTTING PLANES

As mentioned in Section 6, one of the applications of polaroid extensions is (quasi-) concave programming. We now indicate how cutting planes can be constructed from a given extension. In the second part of this section we present a new (interactive) construction of the "deepest" cut to be generated in the present framework.

7.1. Cutting Planes

The inclusion test (see Section 0) of two convex sets can often be accomplished efficiently by a cutting plane algorithm, particularly when the feasible set $X$ is polyhedral. (There exist other types of procedures, which are also based on convex extensions and can be used in this context; see, for instance, [1].)

Take the current best value $k = f(\bar{x})$, where $\bar{x} \in X$ is the corresponding solution; by hypothesis one has $\bar{x} \in \mathrm{lev}_k f$, and a cutting plane is constructed to discard a (cut-off) portion $S \cap X$ which cannot contain a point $y$ better than $\bar{x}$, i.e., $S \cap X \subset \mathrm{Int}(\mathrm{lev}_k f)$, or $f(y) < k$, $\forall y \in S \cap X$, where the set $S$ is used to denote a cut (see [2]). (Here $f$ should be assumed strictly quasi-convex and lower semi-continuous.)
Cutting planes are a convenient way to define such a cut-set $S$, which is then a half-space; thus, in essence, cuts are used to reduce the feasible set $X$ and thereby facilitate the inclusion test. When $X$ is polyhedral, one can define a simplicial cut-set $S$ with $\bar{x}$ a vertex of $X$. Let $\bar{x}$ be obtained from an L.P. tableau of the form
$x = \bar{x} - \sum_{j \in N} a_j t_j$, $t_j \ge 0$,
where $N$ is the non-basic index set corresponding to $\bar{x}$. The cut-simplex $S$ can be determined as the convex hull of the $(n + 1)$ points (see [3, 5]) $\{\bar{x}, \bar{u}_j, j \in N\}$, where $\bar{u}_j$ is the intersection point of the ray $u_j(t_j) = \bar{x} - t_j a_j$, $t_j \ge 0$, with the boundary of the set $\mathrm{lev}_k f$ (or, better, $\mathrm{lev}_k MCF$, $\mathrm{lev}_k PF$, or $\mathrm{lev}_k MQCF$), i.e., $\bar{u}_j = u_j(\bar{t}_j)$, $\bar{t}_j > 0$. Then one has
$S = \{x = x(t) \mid \sum_{j \in N} t_j / \bar{t}_j \le 1, \ t_j \ge 0, \ \forall j \in N\}$.

COROLLARY T4.2: The cut-simplices generated by intersection with convex extensions satisfy the following uniform dominance relation: $S_f \subset S_{MCF} \subset S_{MQCF}$.

7.2. Polaroids

The main purpose of a polaroid extension in the construction of a cut is to ensure depth and validity, and the following immediate observations come to mind:
a) Validity is defined by $x \in S \cap X \Rightarrow x \in \mathrm{lev}_k f$, i.e., the points of $X$ which are actually cut off must lie in $\mathrm{lev}_k f$: $(S \cap X) \subset \mathrm{lev}_k f$.
b) Polaroids $P^*$ of a given set $P$ can be viewed here as a tool to generate cut-sets $S \subset P^*$ because, under proper assumptions on $f$, one has $P^* \cap P = P \cap \mathrm{lev}_k f$ (Lemma 8 and corollaries).
c) Finally, polaroids enjoy the following inclusion reversing property (see [3]): $Q \subset P \Rightarrow P^* \subset Q^*$.

Thus we may now combine a), b) and c) and define the following:
1) Choose the set $Q = (X \cap S \cap \mathrm{lev}_k f)$, where $S$ is a cut-simplex (defined under 3) below).
2) Define the corresponding polaroid
$CP(k) = \{y \mid \varphi(x; y) \le k, \ \forall x \in Q\} = \{y \mid \varphi(x; y) \le k, \ \forall x \in (X \cap S \cap \mathrm{lev}_k f)\} \supset \{y \mid \varphi(x; y) \le k, \ \forall x \in (X \cap \mathrm{lev}_k f)\} = P(k) \supset \mathrm{lev}_k f$
(see Lemma 8).
3) $S$ is the cut-simplex obtained by intersection with the polaroid $CP$; note that $Q$ is defined by $S$, so that $S$ really plays the role of a variable parameter in this construction.
Now, by virtue of the inclusion reversing property c), one may define the following (dynamic) geometrical representation of the interaction between $S$ and the polaroid $CP$:
• Starting with a small $S$ (for instance $X \cap S = \{\bar{x}\}$), one obtains a very large $CP$ (since $\{\bar{x}\}^*(k)$ is a halfspace);
• Since $CP$ is "enormous", a larger cut-simplex $S' \supset S$ can be generated by ray intersection with $CP$.
• But now the corresponding polaroid is shrinking ($CP' \subset CP$) because of the inclusion reversing property c).

Thus, one easily imagines that at some point this mechanism will stop in some equilibrium position (usually not unique) where $S$ can no longer be made larger without violating the condition $S \cap X \subset CP$ required to guarantee validity.

The set $CP(k)$ is called a cutting polaroid to emphasize the interaction between the cut and polaroid constructions: for the vertices of the cut-simplex $S$ one has $\bar{u}_j \in \mathrm{bd}\, CP(k)$, $\forall j \in N$, that is:
$\max \varphi(x; \bar{u}_j) = k$
subject to $x = \bar{x} - \sum_{j \in N} a_j t_j \in X$, $\sum_{j \in N} t_j / \bar{t}_j \le 1$, $t_j \ge 0$, $\forall j$,
where the $\bar{t}_j$ are parameters.

A computationally simpler variant of this scheme is to start with a given hyperplane supporting $X$ at $\bar{x}$, say $\sum_{j \in N} \alpha_j t_j = 0$, and to "push" it into the feasible set as far as possible to create the cut $\sum_{j \in N} \alpha_j t_j \ge \delta$ $(> 0)$ and the corresponding cut-simplex $S$; the cutting polaroid approach is then used to determine the (largest) value $\delta > 0$ such that, $\forall j \in N$:
$\max \varphi(x; \bar{u}_j) \le k$
subject to $x \in X$,
(*) $\sum_{j \in N} \alpha_j t_j \le \delta$, $t_j \ge 0$, $\forall j \in N$,
with $\bar{u}_j$ denoting the vertices of the resulting cut-simplex $S$.

In this case, there is only one parameter $\delta$; note that $\delta$ merely corresponds to the slack variable of the cut constraint (*); otherwise $\delta$ only appears in the objective function $\varphi$.

Thus, the cutting polaroid approach provides a means to determine a deep valid cut by parametric optimization (with parameter $\delta$).
These programs have been studied in [4 and 8] in the quadratic case; this approach is more efficient than the (conventional) intersection approach proposed in [5], because this parametric program (linear in the quadratic case) is easily solved.

8. DISCRETE EXTENSIONS WITH RESPECT TO POLYHEDRAL SETS

When $X$ is polyhedral, it is well known that the program $\max f(x)$, subject to $x \in X$ ($f$ convex), is either unbounded or possesses an optimal vertex; thus, the range of $f$ may be restricted to the finite set of values $f(\mathrm{vert}\, X)$ without essentially modifying the final outcome of the above optimization problem. A polaroid extension $PSF$ can thus be defined as a quasi-convex step function of the following type: $\forall k \in f(\mathrm{vert}\, X)$,
$\mathrm{lev}_k PSF = \{y \mid \varphi(x; y) \le k, \ \forall x \in \mathrm{vert}\, X \text{ with } f(x) \le k\}$;
for intermediate values $k \in (k_1, k_2)$ one has $\mathrm{lev}_k PSF = \mathrm{lev}_{k_1} PSF$, where $k_1$ and $k_2 \in f(\mathrm{vert}\, X)$, $k \notin f(\mathrm{vert}\, X)$.

In fact, we can modify the definition of $PSF$ further by selecting, in the definition of the polaroid, only local optima of the initial non-linear problem, i.e., only the candidates for the global optimum, or (assuming differentiability):
$\mathrm{lev}_k PSF = \{y \mid \varphi(x; y) \le k$ for all those $x \in X \cap \mathrm{lev}_k f$ satisfying the necessary conditions $(z - x)\nabla f(x) \le 0, \ \forall z \in X\}$.

It is easily seen from the definition that $\mathrm{lev}_k PSF \supset P(k) = \mathrm{lev}_k PF$; i.e., one has the dominance relation $PSF(x) \le PF(x)$; note also that for $x \in X - \{\mathrm{vert}\, X\}$ one may have $PSF(x) < f(x)$ so that, strictly speaking, $PSF$ is no longer an extension as defined in Section 1. However, $PSF$ enjoys the same properties as any other polaroid extension if one replaces $X$ by $\mathrm{vert}\, X$, and we therefore do not duplicate the results of the preceding sections.

REMARKS: a) The construction of $PSF$, in principle, requires one to solve (globally) an extreme point program of the type $\max_{x \in Q_0} \varphi(x; y)$, $y$ given, $Q_0 = (\mathrm{vert}\, X \cap \mathrm{lev}_k f)$.
This may seem a tremendous task, but any set $Q$ satisfying $(X \cap \mathrm{lev}_k f) \supset Q \supset Q_0$ may be used to deliver a bound $k'$ which furnishes an approximation (from above) to $PSF$: $\max_{x \in Q_0} \varphi \le \max_{x \in Q} \varphi = k'$.

The step function $PSF$ represents an improvement over the corresponding (i.e., with the same polarization $\varphi$) polaroid extension $PF$, and this happens in two ways which are best illustrated by an application to concave programming.
(i) Because the parameter $k$ can always be chosen to belong to $A = \{k = f(x) \mid x \in \mathrm{vert}\, X, \ x \text{ local optimum}\}$, the value of $k$ is increased step-wise by discrete amounts until the value $k = \max f(x)$ is reached. In a practical sense, this can be realized during the construction of a cut-simplex $S$, for instance; one sets $k_{new} = \max\{k_{old}, f(\bar{x}), f(x^j), \ \forall j \in N\}$, where the $x^j$ are the neighboring (on $X$) vertices of $\bar{x}$, i.e., $x^j = \bar{x} - \hat{t}_j a_j$, where $\hat{t}_j$ is the largest value $t_j > 0$ such that $x^j \in X$.
(ii) One may replace $X$ by the set $\Gamma = \{x \in \mathrm{vert}\, X, \ x \text{ locally optimal}\}$ in the construction of a cutting polaroid (see Section 7) and obtain $CP(k) = (\Gamma \cap S)^*(k)$.

ACKNOWLEDGMENT

I wish to express my thanks to R. Breu, who contributed many improvements and corrections to the manuscript.

BIBLIOGRAPHY

[1] Balas, E. and C.-A. Burdet, "Concave Quadratic Programming," Management Science Research Report #299, Carnegie-Mellon University (1972).
[2] Burdet, C.-A., "Polaroids: A New Tool in Non-Convex and in Integer Programming," Naval Research Logistics Quarterly 20, 13-24 (1973).
[3] Burdet, C.-A., "On Polaroid Intersections," Mathematical Programming in Theory and Practice, P. Hammer and G. Zoutendijk, eds., pp. 365-387 (North Holland, 1974).
[4] Burdet, C.-A., "On Linearly Constrained Non-Convex Quadratic Programs," W.P. 91-72-3, Graduate School of Industrial Administration, Carnegie-Mellon University (1972).
[5] Tuy, Hoang, "Concave Programming Under Linear Constraints" (Russian), Doklady Akademii Nauk SSSR (1964). English translation in Soviet Mathematics, 1437-1440 (1964).
[6] Minkowski, H., "Theorie der Konvexen Körper insbesondere Begründung ihres Oberflächenbegriffes," Gesammelte Abhandlungen, 2 (Leipzig, 1911).
[7] Rockafellar, R. T., Convex Analysis (Princeton University Press, 1970).

A CUTTING PLANE ALGORITHM FOR THE BILINEAR PROGRAMMING PROBLEM

H. Vaish
California State University, Northridge
Northridge, California

C. M. Shetty
Georgia Institute of Technology
Atlanta, Georgia

ABSTRACT

In this paper we discuss the properties of a Bilinear Programming problem and develop a convergent cutting plane algorithm. The cuts involve only a subset of the variables and preserve the special structure of the constraints involving the remaining variables. The cuts are deeper than other similar cuts.

I. INTRODUCTION

The Bilinear Programming Problem considered in this paper can be stated as:

BLP: Minimize $\phi(x, y) = c'x + d'y + x'Cy$   (1)
Subject to: $x \in X_0 = \{x \in R^m \mid Ex \le e, \ x \ge 0\}$
$y \in Y_0 = \{y \in R^n \mid Fy \le f, \ y \ge 0\}$

Without loss of generality, we will assume that $X_0$ and $Y_0$ are bounded. In spite of its special structure, problem BLP is a mathematical statement of a number of practical problems, for example, location-allocation problems, orthogonal production scheduling, multi-stage assignment problems, etc. (see [23] and [28]).

An important property of BLP to observe is that even though $\phi$ can be shown to be not quasi-concave, the optimal $(x^*, y^*)$ is attained at an extreme point of $X_0 \times Y_0$, i.e., $x^*$ will be an extreme point of $X_0$ and $y^*$ an extreme point of $Y_0$ [23, 29]. It seems reasonable that this extreme point optimality property should be taken advantage of in solving BLP. However, $\phi$ is not explicitly quasi-convex, so that local minima can and do exist [28]. This is the essential difficulty in solving BLP.

Problem BLP can be regarded as a Quadratic Programming problem in which the objective function need not be convex. A number of algorithms have been proposed for solving this class of problems.
One group of these algorithms generates a sequence of expanding polytopes such that the minimum over each is known. This annexation procedure terminates when some polytope in the sequence contains the original polytope [13, 27, 29]. Some alternative approaches are discussed in [10, 24].

The strategy we will use to solve the problem will be to develop a series of cuts such that no point with a lower value of $\phi(x, y)$ than the current best available is deleted. The approach is therefore the piecewise strategy discussed by Geoffrion [14] for solving large-scale problems. The process is repeated until all of the feasible region has been explored.

II. CUTTING PLANE STRATEGIES

Cutting planes have been used in integer programming for some time. Recent developments in [1, 2, 6, 15, 17, 19, 30] show how valid cuts can be generated using certain convex sets. On the other hand, several authors have used the cutting plane approach to solve nonconvex problems; e.g., see [3, 5, 11, 12, 13, 21, 26, 27]. The recent work of Burdet unifies these approaches through the use of polaroids [6, 7, 9] and the related, more general concept of level sets of convex gauge functions [8]. Our main concern here is to use Burdet's approach to solve problem BLP. However, as we will see, some modifications need to be made.

Consider a problem P: Min $f(x)$, subject to $x \in S$, where $S$ is a polyhedral (compact) set and $f$ is nonconvex. A local star minimum of P is an extreme point $\bar{x}$ such that $f(\bar{x}) \le f(x)$ for each $x \in N(\bar{x})$, where $N(\bar{x})$ denotes the extreme points adjacent to $\bar{x}$. A local minimum of P is a point $\bar{x}$ such that $f(\bar{x}) \le f(x)$ for each $x \in B_\delta(\bar{x}) \cap S$, where $B_\delta(\bar{x})$ is a $\delta$-neighborhood around $\bar{x}$. If $f$ is quasi-concave, then a local star minimum is also a local minimum. In such a case, a cutting plane can be developed from a local star minimum $\bar{x}$ which cuts off $\bar{x}$ but not any other improving point, as was done in [5, 27].
On the other hand, if the assumption does not hold, a local minimum $\bar{x}$ is needed to define a cutting plane [3, 11, 26].

There is yet another important property we would like to preserve. Cuts involving variables associated with both the $X_0$ and $Y_0$ sets will destroy their special structure. There are problems wherein one of the sets does have a special structure for which efficient algorithms exist that can be used to solve sub-problems in the solution procedure. In the location-allocation problem, the $Y_0$ set can be made to represent the transportation problem constraints.

Konno [23] has discussed this strategy in detail for the BLP. At a local star optimum $(\bar{x}, \bar{y})$ he defines a vector $g$ and finds the parameter value $\sigma$ such that the cut $g'x \ge \sigma$ deletes $\bar{x}$ but no point $x$ such that $\phi(x, y) < k - \epsilon$, where $k$ is the objective function value for the current best solution and $\epsilon > 0$ is a predetermined value. The procedure thus yields an $\epsilon$-optimal solution, i.e., a point $(x^*, y^*)$ such that $\phi(x^*, y^*) \le \phi(x, y) + \epsilon$ for all $x \in X_0$ and $y \in Y_0$.

Gallo and Ülkücü [13] have developed a cutting plane algorithm for BLP similar to the one discussed in this paper, where the cuts are applied in the $x$-set. Using duality theory they consider the following equivalent problem: Min $\psi(x, u) = \{c'x + \text{Max } f'u\}$ subject to $x \in X_0$ and $u \in U = \{u : F'u \le d + C'x, \ u \le 0\}$. For any $x^i \in X_0$, let $u^i$ be the point obtained by solving Max $f'u$ subject to $F'u \le d + C'x^i$, $u \le 0$. A cutting plane is generated from a vertex $x^i$ such that $\psi(x^i, u^i) > k$, where $k$ is the current best value of the objective function. Thus, if the vertex $x^i$ is in fact the current best, one has to move to a point yielding a poorer value of $\psi$. If no such point exists, special steps need to be taken. The algorithm is not guaranteed to converge.

Earlier we have indicated that cuts involving both the x and y variables can be generated from a local minimum.
However, if we want to develop a cutting plane involving only the $x$-variables and yet be convergent, we need to develop the cut at a point $(\bar{x}, \bar{y})$ which is more than a local optimum. Such a point is defined below. Throughout the rest of the paper, by $X_0$ we mean the original feasible region $X_0$ or its subset obtained after the introduction of cuts.

DEFINITION: An extreme point $(\bar{x}, \bar{y})$ is called a pseudo-global minimum if $\phi(\bar{x}, \bar{y}) \le \phi(x, y)$ for each $x \in B_\delta(\bar{x}) \cap X_0$ and for each $y \in Y_0$.

Note that $(\bar{x}, \bar{y})$ is an extreme point of the constraints of BLP if and only if $\bar{x}$ is an extreme point of $X_0$ and $\bar{y}$ is an extreme point of $Y_0$. Further, an extreme point is adjacent to $(\bar{x}, \bar{y})$ if and only if it is of the form $(x^i, \bar{y})$ or $(\bar{x}, y^j)$, where $x^i \in N(\bar{x})$ and $y^j \in N(\bar{y})$.

We will now characterize the various forms of "optimality" for BLP which will be used later in the development of the algorithms. Consider an extreme point $(\bar{x}, \bar{y})$. One can readily translate the origin to this point. Konno [23] has shown that such a transformation will yield a problem of the form given in (1) (with the additional property that $e \ge 0$, $f \ge 0$). The following theorems refer to the translated problem with $(\bar{x}, \bar{y}) = (0, 0)$.

THEOREM 1: The origin $(0, 0)$ is a local star minimum of BLP if and only if $x = 0$ solves the linear program P1: Min $\phi(x, 0)$, $x \in X_0$, and $y = 0$ solves the linear program P2: Min $\phi(0, y)$, $y \in Y_0$.

PROOF: Let $x = 0$ solve P1 and $y = 0$ solve P2. Then $\phi(0, 0) \le \phi(x, 0)$, $x \in X_0$. In particular, $\phi(0, 0) \le \phi(x^i, 0)$, $x^i \in N(0)$. Similarly, $\phi(0, 0) \le \phi(0, y^j)$, $y^j \in N(0)$. Hence, $\phi(0, 0) \le \phi(x, y)$ for each $(x, y) \in N(0, 0)$, and $(0, 0)$ is a local star minimum.

Conversely, let $(0, 0)$ be a local star minimum. Hence, $\phi(0, 0) \le \phi(0, y^j)$, $y^j \in N(0)$. Consider problem P2. Suppose we apply the simplex algorithm and obtain a basic feasible solution corresponding to $y = 0$. Since $\phi(0, 0) \le \phi(0, y^j)$ for each $y^j \in N(0)$, the simplex method will terminate, and $y = 0$ is a solution to P2. Similarly, $x = 0$ solves P1.
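Theorem 1 reduces the local star test to two linear programs. A minimal sketch follows, assuming that the extreme points of $X_0$ and $Y_0$ are available as explicit lists so that each linear program can be solved by enumeration (a practical implementation would call an LP solver instead). The helper names and the small data set are hypothetical and serve only as an illustration; the test is written for an arbitrary extreme point rather than for the translated origin.

```python
def phi(x, y, c, d, C):
    """Bilinear objective phi(x, y) = c'x + d'y + x'Cy for tuples x and y."""
    lin = sum(ci * xi for ci, xi in zip(c, x)) + sum(dj * yj for dj, yj in zip(d, y))
    bil = sum(x[i] * C[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return lin + bil


def is_local_star_min(x_bar, y_bar, X_vertices, Y_vertices, c, d, C):
    """Theorem 1 in untranslated form: x_bar must solve min phi(x, y_bar) over X0
    and y_bar must solve min phi(x_bar, y) over Y0. Each problem is linear, so
    its minimum over a polytope is attained at a vertex and enumeration suffices."""
    best = phi(x_bar, y_bar, c, d, C)
    solves_p1 = all(best <= phi(x, y_bar, c, d, C) for x in X_vertices)
    solves_p2 = all(best <= phi(x_bar, y, c, d, C) for y in Y_vertices)
    return solves_p1 and solves_p2


# Assumed toy data: X0 = Y0 = [0, 1] and phi(x, y) = x + y - 3xy.
X_V = [(0.0,), (1.0,)]
Y_V = [(0.0,), (1.0,)]
c_vec, d_vec, C_mat = (1.0,), (1.0,), ((-3.0,),)

if __name__ == "__main__":
    for point in [((0.0,), (0.0,)), ((1.0,), (1.0,)), ((1.0,), (0.0,))]:
        print(point, is_local_star_min(point[0], point[1], X_V, Y_V, c_vec, d_vec, C_mat))
```

In this toy instance both $(0, 0)$ and $(1, 1)$ pass the test, although only $(1, 1)$ is a global minimum, while $(1, 0)$ fails it. This is the situation in which a local star minimum alone is not enough to justify a cut.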
THEOREM 2: The origin $(0, 0)$ is a global minimum of BLP if $c'x \ge 0$, $d'y \ge 0$ and $x'Cy \ge 0$ for each $x \in X_0$, $y \in Y_0$.

PROOF: Trivial, since the hypothesis implies $\phi(x, y) \ge \phi(0, 0)$ for each $x \in X_0$ and $y \in Y_0$.

THEOREM 3: The origin $(0, 0)$ is a local minimum of BLP if and only if for each $x \in X_0$ and $y \in Y_0$
(i) $c'x \ge 0$ and $d'y \ge 0$, and
(ii) if $x'Cy < 0$ then $(c'x + d'y) > 0$.

PROOF: Suppose conditions (i) and (ii) hold. Then for $\epsilon > 0$ small enough, $\phi(\epsilon x, \epsilon y) = \epsilon(c'x + d'y + \epsilon x'Cy) \ge 0 = \phi(0, 0)$, since the term within parentheses is nonnegative. Hence $(0, 0)$ is a local minimum. Now suppose $(0, 0)$ is a local minimum. If $c'x < 0$, then $\phi(\epsilon x, 0) = \epsilon c'x < 0 = \phi(0, 0)$, contradicting the local optimality of $(0, 0)$. Hence, $c'x \ge 0$, and likewise $d'y \ge 0$. Now if $x'Cy < 0$ and $(c'x + d'y) \le 0$, then $\phi(\epsilon x, \epsilon y) = \epsilon(c'x + d'y + \epsilon x'Cy) < 0 = \phi(0, 0)$ for all $\epsilon > 0$, again contradicting the local optimality of $(0, 0)$. Hence, both conditions (i) and (ii) hold.

THEOREM 4: The origin $(0, 0)$ is a local star minimum of BLP if and only if $c'x \ge 0$ and $d'y \ge 0$ for $x \in X_0$, $y \in Y_0$.

PROOF: Suppose $(0, 0)$ is a local star minimum. Then from Theorem 1, $x = 0$ is a solution to the problem: Min $\phi(x, 0) = c'x$, $x \in X_0$. Hence, $c'x \ge 0$ for each $x \in X_0$. Likewise $d'y \ge 0$ for each $y \in Y_0$. Conversely, let $c'x \ge 0$ and $d'y \ge 0$ for each $x \in X_0$ and $y \in Y_0$. But $\phi(x, 0) = c'x$ and $\phi(0, 0) = 0$. Hence $x = 0$ solves: Min $\phi(x, 0)$, $x \in X_0$. Likewise $y = 0$ solves: Min $\phi(0, y)$, $y \in Y_0$. Then from Theorem 1, $(0, 0)$ is a local star minimum.

THEOREM 5: The origin $(0, 0)$ is a pseudo-global minimum of BLP if and only if for each $x \in X_0$ and $y \in Y_0$
(i) $d'y \ge 0$, and
(ii) if $d'y = 0$ then $(c'x + x'Cy) \ge 0$.

PROOF: Suppose $(0, 0)$ is a pseudo-global minimum. By definition it readily follows that it is a local minimum. Hence, from Theorem 3 we have $d'y \ge 0$. Now for $x \in X_0$, $\alpha x \in B_\delta(0) \cap X_0$ for $\alpha > 0$ small enough. Then $\phi(\alpha x, y) = \alpha(c'x + x'Cy) + d'y \ge \phi(0, 0) = 0$ for each $y \in Y_0$, since $(0, 0)$ is a pseudo-global minimum. Hence, if $d'y = 0$ then $c'x + x'Cy \ge 0$. Now assume that conditions (i) and (ii) hold.
Consider the $m$ edges from $x = 0$ to the adjacent extreme points. Let $0 \ne x^i \in X_0$ be on the $i$-th edge. Let $y^1, \ldots, y^K$ denote the $K$ extreme points of $Y_0$ and let
$\alpha_{ik} = -d'y^k / (c'x^i + (x^i)'Cy^k)$ if $(c'x^i + (x^i)'Cy^k) < 0$, and $\alpha_{ik} = \infty$ otherwise.
Note that $\alpha_{ik} > 0$ by assumptions (i) and (ii). Let $\alpha_i = \min_{k = 1, \ldots, K} \alpha_{ik}$. If all $\alpha_{ik} = \infty$ we will let $\alpha_i$ be an arbitrarily large number. Then
$\phi(\alpha_i x^i, y^k) = \alpha_i(c'x^i + (x^i)'Cy^k) + d'y^k \ge 0$
from conditions (i) and (ii) and the definition of $\alpha_i$. But the minimum of $\phi(x, y)$ for a fixed $x$ is achieved at an extreme point of $Y_0$. Hence, $\min_{y \in Y_0} \phi(\alpha_i x^i, y) \ge 0$ for each $i = 1, \ldots, m$. Now let $S = \mathrm{Conv}[0, \alpha_i x^i \text{ for } i = 1, \ldots, m]$. Then Min $\phi(x, y)$, $x \in S$, $y \in Y_0$, is achieved at an extreme point of $S$, which is $0$ or of the form $\alpha_i x^i$. Hence,
$\min_{x \in S, \ y \in Y_0} \phi(x, y) \ge \min\{\phi(0, 0), \ \min_{y \in Y_0} \phi(\alpha_i x^i, y)\} \ge 0$.
Now select a $\delta > 0$ such that $B_\delta(0) \cap X_0 \subset S$. Then Min $\phi(x, y) \ge 0$ for each $x \in B_\delta(0) \cap X_0$ and for each $y \in Y_0$. That is, $(0, 0)$ is a pseudo-global minimum.

It may be mentioned that the corresponding theorems in [23] can be obtained as special cases of Theorems 2, 3, and 4. Further, from Theorems 3 and 4 we may observe that a local minimum is always a local star minimum. The converse may not be true, as shown in [28]. Further, from the definitions one can note that a pseudo-global minimum is always a local minimum.

III. GENERATION OF A PSEUDO-GLOBAL MINIMUM

Before discussing the procedure for getting a pseudo-global minimum, we will review the method due to Balas [1] for identifying precisely $m$ edges incident on an extreme point $\bar{x}$, which also leads to the resolution of degeneracy.

Let $\bar{x}$ be an extreme point of $X_0$ and let $p_j$, $j \in J$, be the $m$ nonbasic variables at $\bar{x}$, where $J$ is the index set for the nonbasic variables. Denoting by $\bar{e}^j$ the columns of the simplex tableau in Tucker form (extended form), the $m$-vector $x$ can be written as $x = \bar{x} - \sum_{j \in J} \bar{e}^j p_j$, which satisfies the constraint $Ex \le e$ (but not necessarily the nonnegativity restriction) for any $p_j \ge 0$.
Let $N$ and $M$ denote the index sets associated with the constraints $Ex + u = e$ and $x \ge 0$ respectively, i.e.,
$X_0 = \{x \in R^m : \sum_{j=1}^{m} e_{ij} x_j + u_i = e_i, \ u_i \ge 0, \ i \in N, \text{ and } x_j \ge 0, \ j \in M\}$.
Given a basic feasible solution $(\bar{x}, \bar{u})$, let
$N^0 = \{i \in N : u_i \text{ is basic and } \bar{u}_i = 0\}$, $M^0 = \{j \in M : x_j \text{ is basic and } \bar{x}_j = 0\}$.
Let
$X'_0 = \{x \in R^m : \sum_{j=1}^{m} e_{ij} x_j + u_i = e_i, \ u_i \ge 0, \ i \in N - N^0; \ x_j \ge 0, \ j \in M - M^0\}$.
Then clearly $X_0 \subset X'_0$, since $X'_0$ is obtained by deleting constraints of $X_0$.

THEOREM 6 [1]: Let $\bar{x}$ be an extreme point solution obtained by solving a linear program: Min $\beta'x$, $x \in X_0$. Let $X'_0$ be defined as above. Then $\bar{x}$ is a vertex of $X'_0$ and $\beta'x = \beta'\bar{x}$ is a supporting hyperplane for $X'_0$. Besides, $X'_0$ has precisely $m$ distinct edges incident on $\bar{x}$, and each half line $\xi^j = \{x : x = \bar{x} - \bar{e}^j \lambda_j, \ \lambda_j \ge 0\}$, $j \in J$, contains exactly one such edge.

We will now present two algorithms to generate a pseudo-global minimum.

ALGORITHM A.
1. Find a feasible extreme point $x^1$ of $X_0$.
2. (a) Solve: Min $\phi(x^1, y)$, $y \in Y_0$, to yield an optimal $y^1$.
   (b) Solve: Min $\phi(x, y^1)$, $x \in X_0$, to yield an optimum $x^2$.
   Repeat until the procedure converges to a point $(\bar{x}, \bar{y})$, which clearly is a local star minimum.
3. Generate all alternative optimal extreme point solutions $\hat{y}^1, \ldots, \hat{y}^k$ $(k \ge 1)$ to Min $\phi(\bar{x}, y)$, $y \in Y_0$. Solve: Min $\phi(x, \hat{y}^i)$, $x \in X_0$, for $i = 1, \ldots, k$ to yield solutions $\hat{x}^1, \ldots, \hat{x}^k$. If $\phi(\bar{x}, \bar{y}) \le \phi(\hat{x}^i, \hat{y}^i)$ for all $i$, terminate; $(\bar{x}, \bar{y})$ is a pseudo-global minimum. If $\phi(\hat{x}^r, \hat{y}^r) < \phi(\bar{x}, \bar{y})$ for some $r$, go to step 2(a) with $x^1 = \hat{x}^r$.

ALGORITHM B.
1 and 2. Find a local star minimum as in steps 1 and 2 of Algorithm A.
3. Let $\hat{x}^1, \ldots, \hat{x}^m$ be the adjacent extreme points of $\bar{x}$. Solve: Min $\phi(\hat{x}^i, y)$, $y \in Y_0$, to yield solutions $\hat{y}^1, \ldots, \hat{y}^m$. If $\phi(\bar{x}, \bar{y}) \le \phi(\hat{x}^i, \hat{y}^i)$ for all $i$, terminate; $(\bar{x}, \bar{y})$ is a pseudo-global minimum. If $\phi(\hat{x}^r, \hat{y}^r) < \phi(\bar{x}, \bar{y})$ for some $r$, go to step 2(b) with $y^1 = \hat{y}^r$.

It is intuitively evident that Algorithm B yields a pseudo-global minimum. However, to implement it we need a ready means of identifying the adjacent extreme points of $\bar{x}$. Also note that both of the algorithms are finite.
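The steps of Algorithm B can be sketched as follows. This is only an illustration under simplifying assumptions: the extreme points of $X_0$ and $Y_0$ are given as explicit lists, each linear program is replaced by enumeration over those lists, and the adjacency structure of $X_0$ is supplied by the caller through a hypothetical `neighbors` dictionary. None of these names come from the paper.

```python
def phi(x, y, c, d, C):
    """Bilinear objective phi(x, y) = c'x + d'y + x'Cy for tuples x and y."""
    lin = sum(ci * xi for ci, xi in zip(c, x)) + sum(dj * yj for dj, yj in zip(d, y))
    bil = sum(x[i] * C[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return lin + bil


def alternate(x, y, X_vertices, Y_vertices, c, d, C):
    """Step 2: alternate the two linear programs (here: vertex enumeration) until
    neither gives a strict improvement; the result is a local star minimum."""
    while True:
        x_new = min(X_vertices, key=lambda u: phi(u, y, c, d, C))
        if phi(x_new, y, c, d, C) >= phi(x, y, c, d, C):
            x_new = x  # no strict improvement: keep the current vertex
        y_new = min(Y_vertices, key=lambda v: phi(x_new, v, c, d, C))
        if phi(x_new, y_new, c, d, C) >= phi(x_new, y, c, d, C):
            y_new = y
        if (x_new, y_new) == (x, y):
            return x, y
        x, y = x_new, y_new


def algorithm_b(x_start, X_vertices, Y_vertices, neighbors, c, d, C):
    """Steps 1 to 3 of Algorithm B; neighbors[x] lists the vertices adjacent to x."""
    y_start = min(Y_vertices, key=lambda v: phi(x_start, v, c, d, C))  # step 2(a)
    x, y = alternate(x_start, y_start, X_vertices, Y_vertices, c, d, C)
    while True:
        for x_hat in neighbors[x]:  # step 3: probe each adjacent extreme point
            y_hat = min(Y_vertices, key=lambda v: phi(x_hat, v, c, d, C))
            if phi(x_hat, y_hat, c, d, C) < phi(x, y, c, d, C):
                # strict improvement found: return to step 2(b) with y = y_hat
                x, y = alternate(x, y_hat, X_vertices, Y_vertices, c, d, C)
                break
        else:
            return x, y  # no adjacent vertex improves: pseudo-global minimum


# Assumed toy data: X0 = Y0 = [0, 1] and phi(x, y) = x + y - 3xy.
X_V = [(0.0,), (1.0,)]
Y_V = [(0.0,), (1.0,)]
NEIGHBORS = {(0.0,): [(1.0,)], (1.0,): [(0.0,)]}
c_vec, d_vec, C_mat = (1.0,), (1.0,), ((-3.0,),)

if __name__ == "__main__":
    print(alternate((0.0,), (0.0,), X_V, Y_V, c_vec, d_vec, C_mat))
    print(algorithm_b((0.0,), X_V, Y_V, NEIGHBORS, c_vec, d_vec, C_mat))
```

Started at $x = 0$, the alternation of step 2 stalls at the local star minimum $(0, 0)$, whereas step 3 detects the improving neighbor and the sketch ends at $(1, 1)$. Each restart strictly lowers the objective value, so the loop visits finitely many vertex pairs.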
LEMMA 7: Let the origin be translated to the point $(\bar{x}, \bar{y})$ obtained either by Algorithm A or by Algorithm B. Then $(0, 0)$ is a pseudo-global minimum.

PROOF: First consider Algorithm A. The algorithm is terminated at a point $(\bar{x}, \bar{y})$ which is clearly a local star optimum from Theorem 1. Hence, from Theorem 4 we have $d'y \ge 0$. Now consider a $y \in Y_0$ satisfying $d'y = 0$. Clearly it is an alternative optimum to Min $\phi(0, y)$, $y \in Y_0$, and can be expressed as
$y = \sum_{i=1}^{k} \lambda_i \hat{y}^i$, $\sum_{i=1}^{k} \lambda_i = 1$, $\lambda_i \ge 0$,
where the $\hat{y}^i$ are the alternative extreme point optima at step 3. Hence, for $x \in X_0$, using the notation of step 3,
$c'x + x'Cy = \phi(x, y) = \phi(x, \sum_i \lambda_i \hat{y}^i) = \sum_i \lambda_i \phi(x, \hat{y}^i) \ge \sum_i \lambda_i \phi(\hat{x}^i, \hat{y}^i) \ge \phi(0, 0) = 0$.
Hence, from Theorem 5, $(0, 0)$ is a pseudo-global minimum.

Now consider Algorithm B. Let $S = \mathrm{conv}[0, \hat{x}^1, \ldots, \hat{x}^m]$, where $\hat{x}^i \in N(0)$. Then Min $\phi(x, y)$, $x \in S$, $y \in Y_0$, is achieved at an extreme point of $S$ and $Y_0$. From step 3 of the algorithm, $\phi(0, 0) \le \phi(\hat{x}^i, \hat{y}^i)$ for each $i$, and hence $\phi(0, 0) \le \phi(x, y)$ for each $x \in S$, $y \in Y_0$. Select a $\delta > 0$ such that $B_\delta(0) \cap X_0 \subset S$. Then clearly $(0, 0)$ is a pseudo-global minimum by definition.

IV. A CONVERGENT CUTTING PLANE ALGORITHM

We will now show how a cut can be generated from a pseudo-global minimum using the concept of generalized polars [6, 7].

DEFINITION: The generalized polar of $Y_0$ for a given scalar $k$ is given by
$Y^0(k) = \{x : \phi(x, y) \ge k \text{ for all } y \in Y_0\}$.

By definition $Y^0(k)$ contains no point $x \in X_0$ such that $\phi(x, y) < k$ for some $y \in Y_0$. Hence, if $k$ is the current best value of the objective function, the problem is solved if $X_0 \subset Y^0(k)$. Further, it can easily be verified that $Y^0(k)$ is closed and convex. As a matter of fact it is polyhedral, since it can be shown [28] that
$Y^0(k) = \{x : \phi(x, y^i) \ge k \text{ for all } y^i \in V\}$,
where $V$ is the set of extreme points of $Y_0$.

We will now discuss how a valid cut can be generated involving only the $x$-variables using generalized polars.
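The stopping test $X_0 \subset Y^0(k)$ can be illustrated with a short sketch. It relies on the polyhedral description of $Y^0(k)$ through the extreme points of $Y_0$ and on the fact that a bounded polyhedron lies in a convex set exactly when all of its vertices do. Explicit vertex lists, the function names, and the toy data are assumptions made for the example.

```python
def phi(x, y, c, d, C):
    """Bilinear objective phi(x, y) = c'x + d'y + x'Cy for tuples x and y."""
    lin = sum(ci * xi for ci, xi in zip(c, x)) + sum(dj * yj for dj, yj in zip(d, y))
    bil = sum(x[i] * C[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return lin + bil


def in_generalized_polar(x, k, Y_vertices, c, d, C):
    """x belongs to the generalized polar of Y0 for level k if and only if
    phi(x, y_i) >= k at every extreme point y_i of Y0."""
    return all(phi(x, y, c, d, C) >= k for y in Y_vertices)


def problem_solved(k, X_vertices, Y_vertices, c, d, C):
    """X0 (bounded, given by its vertices) is contained in the convex polar set
    if and only if every vertex of X0 is contained in it."""
    return all(in_generalized_polar(x, k, Y_vertices, c, d, C) for x in X_vertices)


# Assumed toy data: X0 = Y0 = [0, 1] and phi(x, y) = x + y - 3xy.
X_V = [(0.0,), (1.0,)]
Y_V = [(0.0,), (1.0,)]
c_vec, d_vec, C_mat = (1.0,), (1.0,), ((-3.0,),)

if __name__ == "__main__":
    print(problem_solved(0.0, X_V, Y_V, c_vec, d_vec, C_mat))   # current best k = 0
    print(problem_solved(-1.0, X_V, Y_V, c_vec, d_vec, C_mat))  # current best k = -1
```

With the current best value $k = 0$ (attained at the origin) the test fails, because $x = 1$ paired with $y = 1$ gives $-1$; with $k = -1$ the whole of $X_0$ lies in the polar and no further cut is needed.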
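Let $(\bar{x}, \bar{y})$ be a pseudo-global minimum of BLP and let $\bar{p}' = (\bar{x}', \bar{u}')$. Suppose the current best value of $\phi$ is $k$, which may or may not be equal to $\phi(\bar{x}, \bar{y})$.

DEFINITION: Given $m$ positive scalars $\lambda_1, \lambda_2, \ldots, \lambda_m$, the inequality
$\sum_{j \in J} p_j / \lambda_j \ge 1$
is a valid cutting plane with respect to $\bar{p}' = (\bar{x}', \bar{u}')$ if $\sum_{j \in J} \bar{p}_j / \lambda_j < 1$ but $\sum_{j \in J} p_j / \lambda_j \ge 1$ for all $p \in P$ such that $\phi(x, y) < k$ for some $y \in Y_0$, where $P = \{(x', u') : Ex + u = e, \ x \ge 0, \ u \ge 0\}$.

A valid cutting plane thus cuts off the pseudo-global minimum but does not cut off any feasible point of $X_0$ which, along with a $y \in Y_0$, yields an objective function value smaller than $k$. The following theorem states how a valid cutting plane can be generated from a pseudo-global minimum.

THEOREM 8: Let $(\bar{x}, \bar{y})$ be a pseudo-global minimum and let the rays $\xi^j$ be as defined in Theorem 6. Let $k$ be the current best value of $\phi$. Let $\bar{\lambda}_j$ be defined by
$\bar{\lambda}_j = \max\{\lambda_j : \bar{x} - \bar{e}^j \lambda_j \in Y^0(k)\}$ if $\xi^j \not\subset Y^0(k)$,
$\bar{\lambda}_j = \infty$ if $\xi^j \subset Y^0(k)$, i.e., if $\bar{x} - \bar{e}^j \lambda_j \in Y^0(k)$ for all $\lambda_j \ge 0$.
Then the inequality $\sum_{j \in J} p_j / \bar{\lambda}_j \ge 1$ is a valid cutting plane.

PROOF: For notational convenience, let us translate the origin to $(\bar{x}, \bar{y})$. Since $\phi(0, y) \ge \phi(0, 0)$ for all $y \in Y_0$, $0 \in Y^0(k)$. From Theorem 6, $X'_0$ has precisely $m$ edges incident on $0$ and each $\xi^j$ contains one such edge. Let $0 \ne \tilde{x} \in X'_0$ be on ray $\xi^j$. We will show that there exists $\bar{\alpha} > 0$ such that $\phi(\alpha_j \tilde{x}, y) \ge 0$ for $0 \le \alpha_j \le \bar{\alpha}$ for all $y \in Y_0$. Suppose, on the contrary,
(2) $\phi(\alpha_j \tilde{x}, y) = \alpha_j(c'\tilde{x} + \tilde{x}'Cy) + d'y < 0$
for some $y \in Y_0$. Since $(0, 0)$ is a pseudo-global minimum, $d'y \ge 0$ by Theorem 5. If $d'y = 0$, by Theorem 5 $(c'x + x'Cy) \ge 0$ for all $x \in X_0$, i.e., $\phi(x, y) \ge \phi(0, y)$ since $d'y = 0$. This implies $x = 0$ solves Min $\phi(x, y)$, $x \in X_0$. From Theorem 6, then, $\phi(x, y) = \phi(0, y)$ is a supporting hyperplane to $X'_0$ at $x = 0$. That is, $\phi(x, y) = c'x + x'Cy \ge 0$ for each $x \in X'_0$ and in particular for $x = \tilde{x}$. Thus, $\phi(\alpha_j \tilde{x}, y) \ge 0$ for all $\alpha_j \ge 0$, a contradiction. On the other hand, if $d'y > 0$, from the expression for $\phi(\alpha_j \tilde{x}, y)$ we have $\phi(\alpha_j \tilde{x}, y) \ge 0$ for all $0 \le \alpha_j \le \bar{\alpha}_j$, where
$\bar{\alpha}_j = -d'y / (c'\tilde{x} + \tilde{x}'Cy)$ if $(c'\tilde{x} + \tilde{x}'Cy) < 0$, and $\bar{\alpha}_j = \infty$ otherwise.
Again this is a contradiction.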
Now let $\bar{\alpha} = \min_j \bar{\alpha}_j$, where $\bar{\alpha}$ can be let to be arbitrarily large if all $\bar{\alpha}_j = \infty$. We have $\phi(\alpha_j \hat{x}, y) \ge 0$ for $0 \le \alpha_j \le \bar{\alpha}$ for all $y \in Y_0$. In other words, $\alpha_j \hat{x} = (-e^j \lambda_j) \in Y^0(k)$ with $\lambda_j > 0$ for each $j \in J$. Hence, noting that $Y^0(k)$ is closed, $\lambda_j > 0$ exists, i.e., $0 \le 1/\lambda_j < \infty$ and $(-e^j \lambda_j) \in Y^0(k)$.

To show that the cutting plane generated at $(0, 0)$ is a valid cut, first note that $\sum_{j \in J} \bar{p}_j / \lambda_j = 0 < 1$ since the nonbasic variables $\bar{p}_j$, $j \in J$, are zero. To complete the proof, consider a feasible $p = (x', u')'$ such that $\phi(x, y) < k$ for some $y \in Y_0$. Note that feasibility implies $p \ge 0$. We will show that $\sum_{j \in J} p_j / \lambda_j \ge 1$. Note that by definition of $\lambda_j$, either $\phi(-e^j \lambda_j, y) = k$ or $\phi(-e^j \lambda_j, y) > k$. Hence, $x$ is in one open half space defined by $\phi(x, y) = k$, and the points $(-e^j \lambda_j)$, $j \in J$, are in its complement, which is a closed half space. Thus, $p$ is feasible to the cut since the hyperplane $\sum_{j \in J} p_j / \lambda_j = 1$ passes through the points $(-e^j \lambda_j)$. This completes the proof.

In order to define a cutting plane we then need to determine the $\lambda_j$ specified in Theorem 8. By definition, for $j = 1, \ldots, m$,

(3) $$\lambda_j = \max\Big\{\lambda_j : \min_{y \in Y_0} \big[c'(\bar{x} - e^j \lambda_j) + d'y + (\bar{x} - e^j \lambda_j)'Cy\big] \ge k\Big\}.$$

This amounts to solving $m$ parametric linear programming problems over $Y_0$. One way of implementing this is as follows: let

$$\psi(\lambda_j) = \min_{y \in Y_0} \{c'(\bar{x} - e^j \lambda_j) + d'y + (\bar{x} - e^j \lambda_j)'Cy\}.$$

Since $\psi$ is a concave function of $\lambda_j$ (see [20]), it is unimodal. We can conduct a search for $\lambda_j$ over an interval $(0, L)$, where $L$ is a large enough number, using Bolzano Search [31]. A linear programming problem is solved for each fixed value of $\lambda_j$ in the interval. Note that we elected to apply the cut only in the $x$-variables in order to take advantage of the ease of solving problems in the $y$-variables. At each iteration of the search, the "interval of uncertainty" is reduced by 1/2. The search is terminated when a value $\lambda_j = \hat{\lambda}_j$ is obtained such that $\psi(\hat{\lambda}_j) - k < \beta$, where $\beta > 0$ is a prespecified tolerance level. Observe that $\hat{\lambda}_j \le \lambda_j$ and hence the cut using $\hat{\lambda}_j$ is also a valid cut.
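The interval-halving search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inner linear program over $Y_0$ is replaced by a minimum over an explicit list of affine pieces (one per extreme point of $Y_0$, since for fixed $y$ the objective is affine in $\lambda_j$), the function names `psi_from_vertices` and `bisect_lambda` are hypothetical, and the stopping rule uses the width of the interval of uncertainty rather than the test on $\psi(\hat{\lambda}_j) - k$.

```python
import math
from typing import Callable, Sequence, Tuple


def psi_from_vertices(pieces: Sequence[Tuple[float, float]]) -> Callable[[float], float]:
    """Return psi(lam) = min_i (a_i + b_i * lam).

    Each pair (a_i, b_i) stands in for one extreme point y^i of Y_0 (an
    assumption made to avoid calling an LP solver). A minimum of affine
    functions is concave, as required by the search.
    """
    def psi(lam: float) -> float:
        return min(a + b * lam for a, b in pieces)
    return psi


def bisect_lambda(psi: Callable[[float], float], k: float,
                  upper: float, tol: float = 1e-6) -> float:
    """Approximate the largest lam in [0, upper] with psi(lam) >= k.

    The returned value never exceeds the true maximum, so a cut built
    from it stays valid (it is merely a little weaker).
    """
    if psi(0.0) < k:
        raise ValueError("the starting point must satisfy psi(0) >= k")
    if psi(upper) >= k:
        # Assumption: 'upper' plays the role of the large number L, so a
        # feasible right end point is treated as lambda_j = infinity.
        return math.inf
    lo, hi = 0.0, upper          # invariant: psi(lo) >= k > psi(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) >= k:
            lo = mid             # mid is still inside the polar set
        else:
            hi = mid             # mid is outside; shrink from the right
    return lo


if __name__ == "__main__":
    psi = psi_from_vertices([(1.0, -0.5), (2.0, -2.0), (0.5, 0.0)])
    print(bisect_lambda(psi, k=0.0, upper=8.0))
```

Returning the lower end of the final interval mirrors the observation that $\hat{\lambda}_j \le \lambda_j$: the search errs on the side of a weaker but still valid cut.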
The search procedures will require the solution of at most $(p + 1)$ linear programs, where $p$ is the smallest positive integer for which $L/2^p < \beta$.

The proposed cutting plane algorithm can be summarized as follows:

1. If the unexplored feasible region $X_0^i$ at stage $i$ is empty, terminate. Otherwise, find a pseudo-global minimum. If necessary, update the value of the current best solution.
2. Solve the parametric linear programs to obtain $\lambda_j$.
3. Introduce the cut and return to step 1.

Convergence of the algorithm is readily proved. Let $\{x^i, y^i\}$ be the sequence of pseudo-global minima generated and $H(x^i)$ be the cutting plane generated at $x^i$. At stage $i$, the algorithm is terminated if $X_0 \cap H^+(x^i) = \emptyset$. Otherwise, the cut $H(x^i)$ is applied and a new pseudo-global point $(x^{i+1}, y^{i+1})$ is found, where $x^{i+1} \in X_0 \cap H^+(x^i)$ and $x^i \notin H^+(x^i)$. We wish to show that the sequence $\{x^i\}$ has a limit point $x^k$ such that $X_0 \cap H^+(x^k) = \emptyset$.

Since the points $x^i$ are in a compact set, there exists a limit point $x^k$ such that for a given $\epsilon > 0$ and a positive integer $N$, $\|x^i - x^k\| < \epsilon$ for some $i > N$. If $X_0 \cap H^+(x^k) \ne \emptyset$, all subsequent pseudo-global minima $(x^l, y^l)$ generated will satisfy the condition $x^l \in H^+(x^k)$, $l \ge k + 1$. Also, from the proof of Theorem 8, $x^k \in B_\delta(x^k) \cap X_0$ and $x^l \notin B_\delta(x^k)$ for some $\delta > 0$. Hence, $\|x^l - x^k\| \ge \delta$ for all $l \ge k + 1$. This contradicts the statement that $x^k$ is a limit point. Hence, $X_0 \cap H^+(x^k) = \emptyset$ and the algorithm is terminated.

We can readily show that the cuts generated using generalized polars are uniformly stronger than those generated by Konno [23]. This cut is of the form

$$\sum_{j \in J} d_j p_j \ge \sigma,$$

where the $d_j$ are positive constants which are selected and $\sigma$ is a parameter which is defined to satisfy certain conditions. For each $j \in J$, Konno calculates the maximum value of $\sigma_j$ such that $(\bar{x} - e^j (\sigma_j / d_j)) \in Y^0(k)$, and then selects $\sigma$ such that $\sigma / d_j \le \sigma_j / d_j$ for all $j \in J$. Now $\sigma_j / d_j \le \lambda_j$, hence $\sigma / d_j \le \lambda_j$ for all $j \in J$.
Hence the cuts generated from $Y^0(k)$ are stronger than those generated by Konno's method.

There is a revealing geometric interpretation to this difference. Konno predetermines the coefficients $d_j$ of the cutting plane so as to simplify computational work. But this has the effect of fixing the slope of the hyperplane. The hyperplane is then translated parallel to itself until one point on it touches the boundary of $Y^0(k)$. The polar cut allows the additional flexibility of altering the slope of the cutting plane so as to cut off more of the feasible region.

V. A FINITELY CONVERGENT $\epsilon$-OPTIMAL ALGORITHM

The above algorithm can be converted to a finitely convergent $\epsilon$-optimal solution procedure by using $Y^0(k - \epsilon)$ instead of $Y^0(k)$ in determining $\lambda_j$. In this case, a cut can be initiated at a local star minimum rather than a pseudo-global minimum. Note that from Theorem 1,

$$\phi(0, y) \ge \phi(0, 0) \ge k > k - \epsilon \quad \text{for each } y \in Y_0.$$

Hence, $0 \in \operatorname{int} Y^0(k - \epsilon)$. This implies that there exists an $\bar{\alpha}$ such that $\phi(\alpha_j \hat{x}, y) \ge k$ for $0 \le \alpha_j \le \bar{\alpha}$ and for all $y \in Y_0$, where $\hat{x}$ is as defined in the proof of Theorem 8. The remaining parts of Theorem 8 hold. Hence, a valid cut can be generated from a local star minimum.

We will now show that the algorithm is finite. At stage $i$, suppose the local star minimum is $(\bar{x}, \bar{y})$ and let $k_i$ be the current best value of $\phi$, attained at $(\tilde{x}, \tilde{y})$, which may or may not be the same as $\phi(\bar{x}, \bar{y})$. Then $(\tilde{x}, \tilde{y})$ is the $\epsilon$-optimal solution over all $y \in Y_0$ and over the part of $X_0$ already explored, i.e., for all $x$ in $X_0$ but not in $X_0^{i+1}$. Note that $k_i$ is decreased at each stage at least by a fixed $\epsilon > 0$. Hence, denoting by $(x^*, y^*)$ the global optimum of BLP, if we show that $x^*$ is cut off by a cut obtained from $Y^0(k)$ for some $k$, then the algorithm is finite.

Now consider BLP with $(\bar{x}, \bar{y})$ as the local star minimum. If $(x^*, y^*)$, with $x^* \in X_0^{i+1}$, is the global minimum, then $x^*$ is in the cone with vertex $\bar{x}$ and $\xi^j$, $j \in J$, as the generators. Then $x^*$ can be expressed as a convex combination of points $x^j$, $j \in J$, on these generators.
Let

$$a_j = \min_{y \in Y_0} \phi(x^j, y) \quad \text{and} \quad k = \min_j a_j.$$

Then $Y^0(k)$ will cut the rays $\xi^j$ at points $(0, \ldots, \lambda_j, \ldots, 0) \ge x^j$. Hence, the cut $\sum_{j \in J} p_j / \lambda_j \ge 1$ obtained from $Y^0(k)$ will cut off $x^*$. This shows that the $\epsilon$-optimal algorithm is finite.

VI. GENERATION OF DEEPER CUTS

We are grateful to the referee for bringing to our attention some recent work of Burdet [8, 9] dealing with convex gauge functions which permits generation of deeper cuts than that given in Section IV. This approach allows $\lambda_j$ to be negative and generates uniformly dominating cuts. Owen [25] seems to be the first to suggest this approach, and related work on negative edge extensions has been investigated in [4, 16, 18, 22]. Using these results the cut can be strengthened by using the following definition for $\lambda_j$:

$$\lambda_j = \begin{cases} \max\limits_{\lambda_j \ge 0} \{\lambda_j < \infty : \phi(\bar{x} - e^j \lambda_j, y) \ge k \text{ for all } y \in Y_0\} & \text{if } \xi^j \not\subset Y^0(k), \\ -\Big[\max\limits_{\lambda_j \ge 0} \{\lambda_j < \infty : \phi(\bar{x} + e^j \lambda_j, y) \ge k \text{ for some } y \in Y_0\}\Big] & \text{if } \xi^j \subset Y^0(k). \end{cases}$$

If $\xi^j \not\subset Y^0(k)$, $0 < \lambda_j < \infty$ can be computed from equation (3) given earlier by solving parametric transportation problems. If $\xi^j \subset Y^0(k)$, as pointed out by the referee, the value of $\lambda_j < 0$ can be computed by solving parametric problems similar to that used for (3). More specifically, in this case we need to solve the parametric problem

$$\lambda_j = -\Big[\max_{\lambda_j \ge 0} \Big\{\lambda_j : \max_{y \in Y_0} \phi(\bar{x} + e^j \lambda_j, y) \ge k\Big\}\Big].$$

REFERENCES

[1] Balas, E., "Intersection Cuts - A New Type of Cutting Planes for Integer Programming," Operations Research, 19, 19-39 (1971).
[2] Balas, E., "Integer Programming and Convex Analysis: Intersection Cuts from Outer Polars," Mathematical Programming, 2, 330-382 (1972).
[3] Balas, E., "Nonconvex Quadratic Programming Via Generalized Polars," Management Science Research Report No. 278, GSIA, Carnegie-Mellon University (1972).
[4] Balas, E., "Disjunctive Programming: Properties of the Convex Hull of Feasible Points," Management Science Research Report 348, GSIA, Carnegie-Mellon University (1974).
[5] Balas, E., and C. A.
Burdet, "Maximizing a Convex Quadratic Function Subject to Linear Constraints," Management Science Research Report No. 299, GSIA, Carnegie-Mellon University (1973).
[6] Burdet, C. A., "Polaroids: A New Tool in Nonconvex and in Integer Programming," Naval Research Logistics Quarterly, 20, 13-22 (1973).
[7] Burdet, C. A., "On Polaroid Intersections," in Mathematical Programming in Theory and Practice, P. Hammer and G. Zoutendijk, eds. (North Holland, 1974).
[8] Burdet, C. A., "Elements of a Theory in Nonconvex Programming," to appear in Naval Research Logistics Quarterly.
[9] Burdet, C. A., "Convex and Polaroid Extensions," to appear in Naval Research Logistics Quarterly.
[10] Cabot, V. A., and R. L. Francis, "Solving Certain Nonconvex Quadratic Minimization Problems by Ranking the Extreme Points," Operations Research, 18, 82-86 (1970).
[11] Candler, W., and R. J. Townsley, "The Maximization of a Quadratic Function of Variables Subject to Linear Inequalities," Management Science, 10, 515-523 (1964).
[12] Cottle, R. W., and W. C. Mylander, "Ritter's Cutting Plane Method for Nonconvex Quadratic Programming," in Integer and Nonlinear Programming, J. Abadie, ed. (North Holland, 1970).
[13] Gallo, G., and A. Ülkücü, "Bilinear Programming: An Exact Algorithm," Report ORC 73-26, Operations Research Center, University of California, Berkeley (1973).
[14] Geoffrion, A. M., "Elements of Large-Scale Mathematical Programming," Management Science, 16, 652-691 (1970).
[15] Glover, F., "Convexity Cuts and Cut Search," Operations Research, 21, 123-134 (1973).
[16] Glover, F., "Polyhedral Convexity Cuts and Negative Edge Extensions," Zeitschrift für Operations Research, 18, 181-186 (1974).
[17] Glover, F., "Convexity Cuts for Multiple Choice Problems," Discrete Mathematics, 6, 221-234 (1973).
[18] Glover, F., "Polyhedral Annexation in Mixed Integer and Combinatorial Programming," Mathematical Programming, 8, 161-188 (1975).
[19] Glover, F., and D.
Klingman, "Concave Programming Applied to a Special Class of 0-1 Integer Programs," Operations Research, 21, 135-140 (1973).
[20] Hillier, F. S., and G. J. Lieberman, Introduction to Operations Research (Holden-Day, 1974).
[21] Hu, T. C., "Minimizing a Concave Function in a Convex Polytope," Mathematics Research Center Report No. 1011, U.S. Army, Madison, Wisconsin (1969).
[22] Jeroslow, R. G., "The Principles of Cutting-Plane Theory: Part 1," with an addendum, GSIA, Carnegie-Mellon University (1974).
[23] Konno, H., "Bilinear Programming," Parts I and II, Technical Report No. 71-9 and 71-10, Operations Research House, Stanford University (1971).
[24] Mueller, R. K., "A Method for Solving the Indefinite Quadratic Programming Problem," Management Science, 16, 333-339 (1970).
[25] Owen, G., "Cutting Planes for Programs with Disjunctive Constraints," Journal of Optimization Theory and Applications, 11, 49-55 (1973).
[26] Ritter, K., "A Method for Solving Maximum-Problems with a Nonconcave Quadratic Objective Function," Z. Wahrscheinlichkeitstheorie verw., 4, 340-351 (1966).
[27] Tuy, H., "Concave Programming Under Linear Constraints" (Russian), Doklady Akademii Nauk SSSR (1964). English translation in Soviet Mathematics, 5, 1437-1440 (1964).
[28] Vaish, H., "Nonconvex Programming with Applications to Production and Location Problems," Unpublished Ph.D. Dissertation, Georgia Institute of Technology (1974).
[29] Vaish, H., and C. M. Shetty, "The Bilinear Programming Problem," Naval Research Logistics Quarterly, 23, No. 2 (1976).
[30] Young, R. D., "Hypercylindrically Deduced Cuts in Zero-One Integer Programming," Operations Research, 19, 1393-1405 (1971).
[31] Zangwill, W. I., Nonlinear Programming - A Unified Approach (Prentice-Hall, 1969).

THE EFFECT OF CORRELATED EXPONENTIAL SERVICE TIMES ON SINGLE SERVER TANDEM QUEUES

C. R. Mitchell
U.S. Air Force Academy

A. S. Paulson
Rensselaer Polytechnic Institute

C. A.
Beswick
University of South Carolina

ABSTRACT

An investigation via simulation of system performance of two stage queues in series (single server, first-come, first-served) under the assumption of correlated exponential service times indicates that the system's behavior is quite sensitive to departures from the traditional assumption of mutually independent service times, especially at higher utilizations. That service times at the various stages of a tandem queueing system for a given customer should be correlated is intuitively appealing and apparently not at all atypical. Since tandem queues occur frequently, e.g. production lines and the logistics therewith associated, it is incumbent on both the practitioner and the theoretician that they be aware of the marked effects that may be induced by correlated service times. For the case of infinite interstage storage, system performance is improved by positive correlation and impaired by negative correlation. This change in system performance is reversed however for zero interstage storage and depends on the value of the utilization rate for the case where interstage storage equals unity. The effect due to correlation is shown to be statistically significant using spectral analytic techniques. For correlation equal unity and infinite interstage storage, results are provided for two through twenty-five stages in series to suggest how adding stages affects system performance for $\rho > 0$. In this extreme case of correlation, adding stages has an effect on system performance which depends markedly on the utilization rate. Recursive formulae for the waiting time per customer for the cases of zero, one, and infinite interstage storage are derived.

1. INTRODUCTION

First, we describe two physical settings where dependent service times can be expected to arise. In a paper mill, large rolls of paper typically pass through an inspection or winding operation prior to being cut into smaller rolls.
A poor quality roll takes a relatively longer time in the inspection process because defective sections must be removed and splices made. When this same roll reaches the final cutting stage it must be processed more slowly to avoid breaking the splices and to repair them when they do break. Hence process times at the two stages tend to be correlated; indeed, it is conceptually possible that they be highly correlated. The process times at the two stages on any two different rolls would generally be independently distributed. In the current context considerable interest would be centered on the effect, if any, produced by nonindependence of process times at different stages.

Jackson [9], in discussing queueing systems with phase type service, pointed out that a typical sequence of events in the overhaul of an aircraft engine consists of stripping, detailed examination, repairs, assembly, and testing. Generally, an engine with a large number of maintenance requirements can be expected to spend more time in each of the latter four phases and so the possible effect of correlated service times on throughput time would be of interest. It is not difficult to envision a host of other situations involving queues in series in which the service times at the various stages for a given customer are correlated.

A large proportion of the literature concerning tandem queues has centered on Poisson arrival processes, exponential service times, and steady state solutions. The assumption of independence of service times is intricately interwoven into the fabric of the traditional birth-death equation approach to finding a transient and steady state solution to the tandem queueing phenomenon. We shall remain within this same framework with the exception that we shall drop the heretofore universal (but tacit!) assumption of mutual independence of all exponential service times.
An obvious approach is to use a multivariate exponential distribution with non-zero correlations in place of the usual independent exponential service times. In our situation it is not clear that the birth-death equation approach can be modified to incorporate dependent service times. Moreover, any such formulation would very likely be analytically intractable. The problem is, however, amenable to a simulation approach and it is in this way that we assess the effect of departures from independence of service times on steady state system performance.

2. THE SERIES QUEUEING SYSTEM AND RECURSIVE FORMULAE

Consider the series queueing process depicted in Figure 1. Customers from an infinite population arrive at a two stage system according to a Poisson process with mean rate $\lambda$ which we shall, without loss of generality, take to be unity. An unlimited queue is always allowed before the first stage, but before the second stage the queue length may be either restricted or unlimited. A single server is allowed at each stage; the service discipline is first-come, first-served.

[Figure 1. Two stage series queue with dependent service times. Customers $\ldots, c_{n+2}, c_{n+1}, c_n, c_{n-1}$ arrive in accordance with a Poisson process with intensity $\lambda$ and pass through stage 1 (single element server), interstage storage of capacity $q - 1$, and stage 2 (single element server). Customer service times are governed by a bivariate exponential distribution with means $\mu_1$ and $\mu_2$ and correlation $\rho$, $-0.25 \le \rho \le 1.0$.]

The system performance measure is taken to be mean waiting time per customer and in this section we develop a set of formulae to recursively compute the waiting time per customer. We use the recursive formulae for the unlimited interstage storage case in order to demonstrate precisely how the queueing system consisting of two stages in series with dependent service times is related to a single server system with interdependent arrival and service processes as discussed by Bhat [2].
An interpretation of Conolly [4] for a special type of this latter interdependence is shown to be helpful in suggesting why mean waiting time is affected by correlated service times.

Denote by $(T_{n,1}, T_{n,2})$ the times between arrival epochs of customers $c_{n-1}$ and $c_n$ at the first and second stages and let $c_n$ experience the service times $(S_{n,1}, S_{n,2})$ at each stage, $n = 1, 2, \ldots$. The sequences of interarrival times $\{T_{n,1}, T_{n,2}\}$ and the $\{S_{n,1}, S_{n,2}\}$ for different customers are both assumed to be mutually independent. We take $(W_{q,n}^{(1)}, W_{q,n}^{(2)})$ to be the waiting times, excluding service, and $(W_n^{(1)}, W_n^{(2)})$ to be the total waiting times, of customer $c_n$ at the respective stages, $n = 1, 2, \ldots$. We illustrate these definitions with an arbitrary combination of arrival and service times in the following diagram. The illustration is for two queues in series with unlimited interstage storage; diagrams like this are useful in developing the recursive formulae for the different cases to be presented.

[Diagram: time lines for customers $c_n$ and $c_{n+1}$, marking the epochs at which each arrives at the first stage and departs the first stage (arriving at the second stage), together with the interarrival times $T_{n+1,1}$, $T_{n+1,2}$, the service times $S_{n,1}$, $S_{n,2}$, $S_{n+1,1}$, $S_{n+1,2}$, and the waiting times at each stage.]

CASE A: Two stage queues in series, unlimited interstage storage.

Customer $c_{n+1}$'s total waiting time at the first stage and interarrival time at the second stage are given by

(1) $$W_{n+1}^{(1)} = \begin{cases} S_{n+1,1}, & \text{if } T_{n+1,1} > W_n^{(1)} \\ W_n^{(1)} - T_{n+1,1} + S_{n+1,1}, & \text{if } T_{n+1,1} < W_n^{(1)} \end{cases}$$

and

(2) $$T_{n+1,2} = \begin{cases} T_{n+1,1} - W_n^{(1)} + S_{n+1,1}, & \text{if } T_{n+1,1} > W_n^{(1)} \\ S_{n+1,1}, & \text{if } T_{n+1,1} < W_n^{(1)}. \end{cases}$$

The condition in (1) and (2) that $T_{n+1,1} > W_n^{(1)}$ simply means that $c_{n+1}$ arrives at server one after $c_n$ has departed, and likewise $T_{n+1,1} < W_n^{(1)}$ means $c_{n+1}$ arrives before $c_n$ leaves. Similar to (1), $c_{n+1}$'s waiting time at the second stage is

(3) $$W_{n+1}^{(2)} = \begin{cases} S_{n+1,2}, & \text{if } T_{n+1,2} > W_n^{(2)} \\ W_n^{(2)} - T_{n+1,2} + S_{n+1,2}, & \text{if } T_{n+1,2} < W_n^{(2)}. \end{cases}$$

The above diagram illustrates (1), (2), and (3) for $T_{n+1,1} > W_n^{(1)}$ and $T_{n+1,2} < W_n^{(2)}$. Similar diagrams result for the remaining conditions.
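The Case A recursions can be written compactly with a maximum in place of the two-branch definitions. The Python sketch below is illustrative only: the function names `case_a_step` and `mean_waits` are hypothetical, the service times are drawn independently (the uncorrelated baseline, not the bivariate model of the paper), and the system is assumed to start empty.

```python
import random
from typing import Tuple


def case_a_step(w1: float, w2: float, t1: float,
                s1: float, s2: float) -> Tuple[float, float, float]:
    """One step of the Case A recursion (unlimited interstage storage).

    w1, w2 : total waiting times of customer n at stages one and two
    t1     : interarrival time of customer n+1 at stage one
    s1, s2 : service times of customer n+1 at stages one and two
    Returns (W1 of n+1, interarrival time of n+1 at stage two, W2 of n+1).
    """
    w1_next = max(w1 - t1, 0.0) + s1      # both branches of (1)
    t2 = t1 + w1_next - w1                # gap between stage-one departures, (2)
    w2_next = max(w2 - t2, 0.0) + s2      # both branches of (3)
    return w1_next, t2, w2_next


def mean_waits(mu: float, customers: int, seed: int = 0) -> Tuple[float, float]:
    """Average total waiting time per stage with independent services.

    Arrivals are Poisson with rate 1, so the utilization equals mu.
    """
    rng = random.Random(seed)
    w1 = w2 = 0.0                         # assumed empty system at the start
    total1 = total2 = 0.0
    for _ in range(customers):
        t1 = rng.expovariate(1.0)         # exponential interarrival, mean 1
        s1 = rng.expovariate(1.0 / mu)    # exponential service, mean mu
        s2 = rng.expovariate(1.0 / mu)
        w1, _t2, w2 = case_a_step(w1, w2, t1, s1, s2)
        total1 += w1
        total2 += w2
    return total1 / customers, total2 / customers


if __name__ == "__main__":
    print(case_a_step(3.0, 2.0, 1.0, 2.0, 1.0))
    print(mean_waits(mu=0.5, customers=20000))
```

Writing (2) as the difference of successive stage-one departure epochs makes it plain why the stage-two interarrival time depends on $S_{n+1,1}$, which is the source of the dependence discussed next.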
In an obvious way, we can use these relationships to build up a set of recursive formulae for any number of stages in series where the interstage storage between stages is unlimited. (See Appendix.)

Since each customer must proceed through both stages, the output of the first stage becomes the input of the second stage and therefore we have, in steady state, that the time interval between arrivals at the second stage satisfies a Poisson process with the same interarrival intensity parameter $\lambda$ as the input distribution [3]. Unlike the first stage however, $c_n$'s service time at the second stage is correlated with the interarrival time there. In the above diagram, this corresponds to a correlation between $S_{n+1,2}$ and $T_{n+1,2}$. This result is apparent from (2) since $S_{n+1,1}$ and $S_{n+1,2}$ are dependent by assumption.

If $S_{n+1,1}$ and $S_{n+1,2}$ are independent, as is usually assumed for two stage server systems, then each stage, in steady state, can be analyzed independently, and since $T_{n+1,2}$ and $S_{n+1,2}$ are independent, as are $T_{n+1,1}$ and $S_{n+1,1}$, the regular M/M/1 results obtain for each stage.

Bhat [2] describes five different classes of single server first-come first-served systems with Poisson input and exponential service times which result from relaxing some of the assumptions of independence which are typically made. These classes represent more realistic operating systems than those with assumptions of independence; Bhat further points out that more work needs to be done on these problems than the limited amount reported at that time (1969). One of these classes is for systems with interdependent arrival and service processes, as is the case here for $T_{n+1,2}$ and $S_{n+1,2}$. Conolly [4] and Conolly and Hadidi [5] have studied a dependent structure somewhat similar to this wherein the ratio of service time to interarrival time is constant for all $n$; they give transient as well as steady state results for the system.
Conolly showed numerically that this pattern of server behavior results in a drastic reduction in the mean and variance of the waiting time as compared with a conventional M/M/1 queue. It was noted by Conolly that this kind of server behavior is to be expected from a well regulated service facility where the server adjusts the service time of a customer according to that customer's interarrival time, which the server observes without error. In this way, a long interval gives rise to a long service time, and short intervals corresponding to a succession of rapid arrivals are followed by correspondingly short service times. This regulated behavior therefore prevents a long queue from forming and cuts down on the mean and variance of the waiting time in the system.

Returning to the two stages in series problem under study, we see that this system, via equation (2), can be viewed as a type of self-regulated system since $S_{n+1,2}$ and $T_{n+1,2}$ are related, although not in the deterministic way assumed by Conolly. It will be demonstrated later that our type of stochastic dependence between $S_{n+1,2}$ and $T_{n+1,2}$ gives rise to results which are consistent with Conolly's. This artificial way of viewing the system as a self-regulating device is employed solely to make the effects seem more reasonable and in no way influences the results.

CASE B: Two stage queues in series, no interstage storage.

For this case, $c_n$'s total waiting time at the second stage, $W_n^{(2)}$, is always equal to $S_{n,2}$, so the only quantity of interest here is $W_n^{(1)}$. Since there is restricted (zero) interstage storage, the phenomenon of blocking occurs and so the waiting time computation is a bit more complicated than in Case A. (In effect, the first server's utilization is diminished [17].)

The total waiting time for $c_{n+1}$ at the first server is given by one of four relationships depending upon whether or not $c_{n+1}$ arrives at stage one before or after $c_n$ leaves.
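For $T_{n+1,1} < W_n^{(1)}$,

(4) $$W_{n+1}^{(1)} = \begin{cases} W_n^{(1)} - T_{n+1,1} + S_{n+1,1}, & \text{if } S_{n+1,1} > S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + S_{n,2}, & \text{if } S_{n+1,1} < S_{n,2} \end{cases}$$

and for $T_{n+1,1} > W_n^{(1)}$,

(5) $$W_{n+1}^{(1)} = \begin{cases} S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(1)} - T_{n+1,1} + S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(1)} - T_{n+1,1} + S_{n,2}. \end{cases}$$

The following diagram illustrates (4) for $S_{n+1,1} < S_{n,2}$.

[Diagram: time lines for customers $c_n$ and $c_{n+1}$ showing $T_{n+1,1}$, $W_n^{(1)}$, $W_{n+1}^{(1)}$, $S_{n,2}$, $S_{n+1,2}$, and the interval during which server 1 is blocked.]

Similarly, the other conditions can be verified.

CASE C: Two stage queues in series, interstage storage capacity equals one.

As in the previous case blocking can occur at the first stage, but here a customer's total waiting time at stage two can exceed the service time since interstage storage is permitted on a restricted basis. If $T_{n+1,1} < W_n^{(1)}$,

(6) $$W_{n+1}^{(1)} = \begin{cases} W_n^{(1)} - T_{n+1,1} + S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(2)} - S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(2)} - S_{n,2} \end{cases}$$

and $c_{n+1}$'s interarrival time at server two is

(7) $$T_{n+1,2} = \begin{cases} S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(2)} - S_{n,2} \\ W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(2)} - S_{n,2}. \end{cases}$$

If $T_{n+1,1} > W_n^{(1)}$,

(8) $$W_{n+1}^{(1)} = \begin{cases} S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2} \end{cases}$$

and

(9) $$T_{n+1,2} = \begin{cases} T_{n+1,1} - W_n^{(1)} + S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2} \\ W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2}. \end{cases}$$

Next the total waiting time of $c_{n+1}$ at stage two, $W_{n+1}^{(2)}$, is computed by using (3) in Case A with $T_{n+1,2}$ as defined in (7) or (9). The following diagram is descriptive of (8) and (9) for the case $T_{n+1,1} > W_n^{(1)}$.

[Diagram: time lines for customers $c_n$ and $c_{n+1}$ showing $S_{n,1}$, $T_{n+1,1}$, $W_n^{(1)}$, $W_n^{(2)}$, $W_{n+1}^{(1)}$, $T_{n+1,2}$, $S_{n,2}$, and $S_{n+1,1}$.]

3. A BIVARIATE EXPONENTIAL DISTRIBUTION

There are a number of bivariate exponential distributions which could be used to describe the dependence assumed between $S_1$ and $S_2$ (we drop the subscripts $n$ for now). We choose to use a special case of the bivariate gamma distribution discussed by Wicksell [18] and Kibble [12] and more generally by Krishnamoorthy and Parthasarathy [14] and Paulson [16]. The functional form can be written as

(10) $$f(s_1, s_2) = \frac{a}{\theta_1 \theta_2} \exp\Big(-\frac{s_1}{\theta_1} - \frac{s_2}{\theta_2}\Big)\, I_0\Big(2 \Big(\frac{d\, s_1 s_2}{\theta_1 \theta_2}\Big)^{1/2}\Big)$$

where $s_1 > 0$, $s_2 > 0$, and $I_0$ is the modified Bessel function of the first kind and order zero. Here $a > 0$, $d \ge 0$, and $a + d = 1$.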
(See Downton [7].) The actual simulation of variates $S_n = (S_{n,1}, S_{n,2})'$ from the distribution (10) and its generalization due to Paulson [16] is effected via

(11) $$S_n = X_1 + V_1 X_2 + V_1 V_2 X_3 + \cdots;$$

here $X_j$ is a 2-vector of independent exponential variates with mean vector $(\theta_1, \theta_2)'$ and the $V_j$ are random $2 \times 2$ matrices which take on values in the set

(12) $$\left\{ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},\ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\ \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},\ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}$$

with probabilities $a$, $b$, $c$, $d$ respectively. All the $X_j$'s and $V_j$'s are mutually independent. Note that eventually the product $\prod V_j$ will result in a matrix of zeroes and so with probability one $S_n$ is represented by a finite sum [11, 13]. The bivariate random variable $S_n$ in (11) has mean vector and covariance matrix

(13) $$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} = \begin{pmatrix} \theta_1 / (a + c) \\ \theta_2 / (a + b) \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \mu_1^2 & \dfrac{ad - bc}{1 - d}\, \mu_1 \mu_2 \\ \dfrac{ad - bc}{1 - d}\, \mu_1 \mu_2 & \mu_2^2 \end{pmatrix},$$

and for the correlation $\rho$, $-0.25 \le \rho \le 1$.

4. SIMULATION RESULTS AND INTERPRETATION

Simulated results are presented in this section for three cases of interstage storage capacity: (A) infinite ($q = \infty$), (B) zero ($q = 1$), and (C) one ($q = 2$). For the infinite interstage storage case results will be given for two stages in series for various values of correlation and for two through twenty-five stages in series for correlation equal unity. The latter depicts how adding stages might affect system performance given correlation $\rho > 0$; more precisely, it provides an envelope within which system performance will vary, since for a fixed number of stages and utilization, correlation unity provides an extremum and correlation zero provides another. In each of the three cases we allow infinite storage before stage one.

In the ordinary case in which the correlation between paired service times is zero, a few steady state results are available for comparison purposes. We have taken the mean arrival rate to be unity and so the steady state utilization, $v$, at stage $i$ is simply the mean service time $\mu_i$.
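Returning to the representation (11), a minimal Python sketch of the sampling scheme is given here. It is an illustration under stated assumptions rather than the authors' program: the names `sample_service_pair` and `implied_correlation` are hypothetical, the diagonal matrices $V_j$ are tracked by two flags instead of explicit matrix products, and the correlation helper uses the value $(ad - bc)/(1 - d)$ that a short calculation from (11) gives. The limiting case of correlation unity ($d = 1$) is excluded because the sum would never terminate.

```python
import random
from typing import Tuple

Probs = Tuple[float, float, float, float]


def implied_correlation(probs: Probs) -> float:
    """Correlation of (S_1, S_2) implied by the probabilities (a, b, c, d)."""
    a, b, c, d = probs
    return (a * d - b * c) / (1.0 - d)


def sample_service_pair(rng: random.Random, theta: Tuple[float, float],
                        probs: Probs) -> Tuple[float, float]:
    """Draw one pair (S_1, S_2) as X_1 + V_1 X_2 + V_1 V_2 X_3 + ..."""
    a, b, c, d = probs
    if abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("a, b, c, d must sum to one")
    if a + c <= 0.0 or a + b <= 0.0:
        raise ValueError("each component needs a positive chance of stopping")
    alive1 = alive2 = True       # diagonal of the running product V_1 ... V_j
    s1 = s2 = 0.0
    while alive1 or alive2:
        x1 = rng.expovariate(1.0 / theta[0])   # exponential with mean theta_1
        x2 = rng.expovariate(1.0 / theta[1])   # exponential with mean theta_2
        if alive1:
            s1 += x1
        if alive2:
            s2 += x2
        u = rng.random()                       # choose the next matrix V_j
        if u < a:
            keep1, keep2 = False, False        # zero matrix
        elif u < a + b:
            keep1, keep2 = True, False         # diag(1, 0)
        elif u < a + b + c:
            keep1, keep2 = False, True         # diag(0, 1)
        else:
            keep1, keep2 = True, True          # identity
        alive1 = alive1 and keep1
        alive2 = alive2 and keep2
    return s1, s2


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_service_pair(rng, (1.0, 1.0), (0.5, 0.0, 0.0, 0.5)))
    print(implied_correlation((0.0, 0.5, 0.5, 0.0)))
```

With $b = c = 0$ the two components stop together, which gives the positively correlated special case; putting weight on $b$ and $c$ lets one component continue while the other stops, which is how negative correlation arises.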
It will suffice for our purposes to take $\mu_1 = \mu_2 = \mu$ since similar steady state behavior will obtain for $\mu_1 \ne \mu_2$. Furthermore, there do not seem to be many results available for purposes of comparison for Cases B and C when $\mu_1 \ne \mu_2$. For $\lambda = 1$, our system performance measure of mean waiting time (queueing plus service) is equivalent to the expected number in the system.

CASE A: $k$ stage queues in series, infinite interstage storage. (Graphs are labeled $k$Q for $k$ queues.)

Steady state results for $k$ stages in series with no correlation between pairs of service times are available [17] and we have that the expected number of customers at each stage is $v/(1 - v)$ and $kv/(1 - v)$ in the system.

Figure 2 provides for two stages in series the mean waiting time at the second stage, $\bar{W}_n^{(2)}$, for $v = 0.75$ and $\rho = -0.25$, 0, 0.50, and 1.0. In this case the mean waiting time at the first stage is independent of $\rho$ since no blocking occurs, and hence it suffices to examine the mean waiting time at the second stage to determine the effects of correlated service times. In some of the simulation results to follow we replicate, many times, runs of much shorter length; here we choose to illustrate the mean waiting time as a function of $n$ with one very long run. Long runs, such as this one, may be considered as being composed of replications of smaller runs where the starting condition of a new replication is the ending value of the previous replication [6]. From the figure it is clear that each graph tends to stabilize for increasing $n$ in accordance with the law of large numbers.

[Figure 2. Mean waiting time at second stage, 2Q, $q = \infty$, $v = 0.75$. Horizontal axis: customers $n$ (in thousands), 10 to 100; vertical axis: $\bar{W}_n^{(2)}$. Curve labels, second stage (system): $\rho = -0.25$: 3.24 (6.31); $\rho = 0$: 3.05 (6.04); $\rho = 0.50$: 2.78 (5.79); $\rho = 1.0$: 2.40 (5.37).]
The numbers adjacent to the values of $\rho$ in Figure 2 are the mean waiting times for the second stage and mean waiting times in the system after 100,000 service completions. We point out that for $\rho = 0$ the mean waiting times at stages one and two are 2.99 and 3.05 respectively, and these are in close agreement with the expectation of 3.0 for this utilization. For $\rho \ne 0$, we see here a bonus attached to positive correlation in service times since system performance improves with increasing correlation. On the other hand, system performance deteriorates with negative correlation. Each figure like this one has a starting condition based on the mean waiting time from a pilot run, and we then omit the waiting times of the first 1000 customers in the actual computations shown.

Figure 3 gives system performance for different values of $v$ and for $\rho = 0, 1$. These graphs are intended to show that there is no discernible effect due to correlation $\rho > 0$ at utilization $v = 0.6$, but as $v$ increases from 0.6 to 0.9 a definite trend appears.

Figure 4 shows the ratio of mean time in the system for various values of $\rho$ to the mean waiting time in the system at $\rho = 0$. These kinds of graphs are based on an average of 100 replications of 1400 service completions (after an initialization of 400 service completions was discarded).

The solid lines in Figures 4, 7, and 9 depict a smoothed fit to the actual data. Sampling variation, of course, precludes the possibility of obtaining such a smooth fit without extremely long runs or extensive replications, but each curve was spot-checked to ascertain whether or not the fit was spurious. In no case was any substantial deviation recorded.

Now we show how these effects are consistent with Conolly's [4] results for the case of $v = 0.9$ and correlation of $\rho = 1.0$ between the service times in the two stages.
Conolly showed for his single server queueing system, where the ratio of service time to the interarrival time was constant for all $n$, that for a utilization of 0.9 (the ratio) the mean waiting time (queueing plus service) was 2.71. For service time independent of interarrival time the steady state expectation, for this utilization, is 9.0. The interarrival time and service time in Conolly's system are perfectly correlated, whereas in our system the two service times are perfectly correlated. It is clear from equation (2) that the correlation between the interarrival time at the second stage and the service time there is less than one, and so the improvement in system performance for our system should be less than Conolly's (an elementary derivation shows the correlation to be $v\rho$, or 0.9 in this case). We see from Figure 3 that the mean waiting time at the second stage after 100,000 customers is 4.13, and indeed the improvement is less.

[Figure 3. Mean waiting time at second stage, 2Q, $q = \infty$. Panels (a) $v = 0.6$, (b) $v = 0.7$, (c) $v = 0.8$, (d) $v = 0.9$, each with curves for $\rho = 0$ and $\rho = 1$. Horizontal axis: customers $n$ (in thousands).]

[Figure 4. Ratio of mean waiting times, 2Q, $q = \infty$. Horizontal axis: correlation $\rho$.]

[Figure 5a. Mean waiting time, 5Q, $q = \infty$, $\rho = 1.0$, $v = 0.9$. Horizontal axis: customers $n$ (in thousands), 100 to 1000. Curve labels (ratio in parentheses): $\bar{W}^{(5)}$: 4.90 (.608); $\bar{W}^{(4)}$: 4.72 (.624); $\bar{W}^{(3)}$: 4.48 (.657); $\bar{W}^{(2)}$: 4.15 (.736).]

[Figure 5b. Ratio of mean waiting time with $\rho = 1$ to expected mean waiting time with $\rho = 0$, 2Q-25Q, $q = \infty$ (based on 100,000 service completions). Horizontal axis: servers in series; vertical axis: $\sum_{i=1}^{k} \bar{W}^{(i)} \big/ \big(kv/(1 - v)\big)$.]

In Figure 5a we show the mean waiting time as a function of $n$ for five stages in series where the service times are equal at each stage. The graphs are labeled $\bar{W}^{(k)}$ corresponding to the mean waiting time at stage $k$, $k = 1, 2, \ldots, 5$. We see that the mean waiting time $\bar{W}^{(2)}$, for the second stage, is consistent with the results in Figure 3. The results for $\bar{W}^{(3)}$, $\bar{W}^{(4)}$, and $\bar{W}^{(5)}$ suggest that further improvements in system performance occur over the $\rho = 0$ case but the effect seems to approach a limit. The number in parentheses to the right of $\bar{W}^{(k)}$ is the ratio

$$\sum_{i=1}^{k} \bar{W}^{(i)} \Big/ \frac{kv}{1 - v}.$$

Figure 5b shows this ratio for two through twenty-five stages in series for $\rho = 1$. These results were obtained by extending recursive formulae (1), (2), and (3). In this extreme case of correlation, adding stages has an effect on system performance which depends markedly on the utilization rate; e.g., for $v = 0.7$ system performance is improved through the first four stages and then is reduced. A utilization of 0.9 gives rise to much improved system performance through twenty-five stages.

CASES B and C: Two stage queues in series, finite (including zero) interstage storage.

For these cases the utilization is effectively reduced in value [17]. The maximum effective utilization is $v_{\max} = (q + 1)/(q + 2)$, where the queue in stage two is limited to a length of $q - 1$ units. We consider the cases $q = 1$ and $q = 2$.

Figure 6 shows the mean waiting time at the first stage for $q = 1$ and several values of $v$. For this case each customer's waiting time at the second stage is simply the service time there, so we are concerned only with the waiting time process at stage one.
Figure 7 shows, for stage one, the ratio of mean waiting time at stage one with $\rho \neq 0$ to the mean waiting time at stage one with $\rho = 0$. Steady state results for the mean number of customers in the system, L, for $\rho = 0$, $q = 1$ and with utilization $\nu$ are given in Morse [15]; we have that

(14)  $L = 4\nu(2-\nu^{2}) / ((2+\nu)(2-3\nu)).$

For $\nu = 0.4$, 0.5, and 0.6 and for $\rho = 0$, the observed (expected) values of L are 1.55 (1.53), 2.87 (2.80), and 7.80 (7.57) respectively. The observed values are from Figure 6.

For $\rho \neq 0$, again we see a dramatic effect in system performance. System performance deteriorates as the correlation $\rho$ increases through positive values and improves as $\rho$ decreases through negative values. Hence when there is no storage allowed before stage two, the departure from independence results in significantly different steady state behaviors, especially as the value $\nu_{\max}$ is approached.

[Figure 6. Mean waiting time at first stage, 2Q, $q = 1$. Panels: (a) $\nu = 0.4$, (b) $\nu = 0.5$, (c) $\nu = 0.6$. Horizontal axis: customers (in thousands).]

[Figure 7. Ratio of mean waiting times at first stage, 2Q, $q = 1$. Horizontal axis: correlation $\rho$.]

[Figure 8. Mean waiting time at first stage, 2Q, $q = 2$. Panels include (a) $\nu = 0.6$ and (c) $\nu = 0.70$. Horizontal axis: customers (in thousands).]

Finally, we consider the case $q = 2$.
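As a numerical check on equation (14), the short sketch below evaluates L at the three utilizations quoted for $q = 1$, together with the maximum effective utilization $(q+1)/(q+2)$. The function names are hypothetical and serve only as illustration; the formulas themselves are those stated in the text.

```python
def max_effective_utilization(q):
    """Maximum effective utilization (q + 1) / (q + 2) for finite storage q."""
    return (q + 1) / (q + 2)


def mean_number_in_system(nu):
    """Equation (14): mean number in system for rho = 0 and q = 1.

    The expression is only meaningful below the saturation point nu = 2/3.
    """
    if not 0.0 < nu < max_effective_utilization(1):
        raise ValueError("nu must lie strictly between 0 and 2/3")
    return 4.0 * nu * (2.0 - nu ** 2) / ((2.0 + nu) * (2.0 - 3.0 * nu))


if __name__ == "__main__":
    for nu in (0.4, 0.5, 0.6):
        print(nu, round(mean_number_in_system(nu), 2))
```

Rounded to two decimals, the three values are 1.53, 2.80 and 7.57, which are the expected values given in parentheses in the text. The denominator vanishes at $\nu = 2/3$, which is the value of $(q+1)/(q+2)$ for $q = 1$.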
Figure 8 shows the mean waiting time as a function of n at the first stage and the ending value for the mean waiting time in the second stage. The mean waiting time at stage two was very stable for all values of n, so those values will not be illustrated. The effect for $\nu = 0.6$ is in the same direction as for $q = 1$ but reverses as $\nu$ increases, so that for values close to $\nu_{\max}$ the change in system performance is consistent with the $q = \infty$ case; that is, improvement for $\rho > 0$ and deterioration for $\rho < 0$. Figure 9 shows the effect in this case for $\rho \neq 0$.

5. SPECTRAL ANALYSIS OF $\{W_n^{(1)}\}$ AND $\{W_n^{(2)}\}$

In this section we analyze the time series $\{W_n^{(1)}\}$ and $\{W_n^{(2)}\}$ and apply a nonparametric test to the ratio of certain estimated power spectra associated with them.

Let $\{z_t,\ t = 1, 2, \ldots, N\}$ be a realization of a stochastic process with mean $\mu$ and autocovariances $\gamma_k$, $k = 1, 2, \ldots$. A study of a time series in terms of its autocovariances is referred to as a time domain analysis. Another type of analysis is concerned with the frequency content of the time series, namely spectral analysis [1, 8, 10]. The Fourier cosine transform of the autocovariances $\gamma_0, \gamma_1, \gamma_2, \ldots$ is called the power spectrum. Denoting the power spectrum by $f(\omega)$, we can write

(15)  $f(\omega) = 2\left[\gamma_0 + 2\sum_{k=1}^{\infty} \gamma_k \cos 2\pi\omega k\right], \quad 0 \le \omega \le 1/2,$

and inverting $f(\omega)$ we can express $\gamma_k$ as

(16)  $\gamma_k = \int_0^{1/2} f(\omega) \cos 2\pi\omega k \, d\omega, \quad k = 0, 1, 2, \ldots,$

with $\gamma_k$ estimated by $c_k$, where

$$c_k = \frac{1}{N}\sum_{t=1}^{N-k} (z_{t+k} - \bar{z})(z_t - \bar{z}).$$

When $k = 0$ we obtain the variance $\gamma_0$ of the process as the integral of the power spectrum:

(17)  $\gamma_0 = \int_0^{1/2} f(\omega)\, d\omega.$

Thus the power spectrum can be considered as a decomposition of the variance at different frequencies. To get sample results which are statistically consistent we do not estimate the spectrum at a particular frequency but instead estimate the average power about the frequency of concern.
The average power corresponds to weighting the autocovariances in the time domain, and we typically estimate $f(\omega)$ with the truncated estimate

(18)  $\hat{f}(\omega_j) = 2\left[\lambda_0 c_0 + 2\sum_{k=1}^{m} \lambda_k c_k \cos 2\pi\omega_j k\right],$

where $\omega_j = j/(2m)$, $j = 0, 1, 2, \ldots, m$, and the weights $\lambda_k$, $k = 0, 1, 2, \ldots, m$, form a so-called lag window. We choose the Blackman-Tukey "hamming" window,

(19)  $\lambda_k = 0.54 + 0.46 \cos(\pi k/m), \quad k = 0, 1, 2, \ldots, m.$

In (18), the sample autocovariances $c_{m+1}, c_{m+2}, \ldots$ are omitted since, for m sufficiently large, they should contribute little information. As a result, only m autocovariances need be calculated and savings in computation may be considerable. Considerable care must be used when selecting m, however, because too large a value will increase the variance of the estimates and too small a value will not give enough resolution.

Next we examine several sample power spectra associated with the simulated waiting times for the two-server infinite interstage storage case. We take the simulated values $\{W_n^{(i)}\}$, $n = 1, 2, \ldots, N$; $i = 1, 2$, to be time series where, as before, $W_n^{(i)}$ is the total waiting time, queueing plus service, of customer n at server i. Figure 10 shows a portion of the sample spectra for $\{W_n^{(1)}\}$ and $\{W_n^{(2)}\}$, $n = 1, 2, \ldots, 2000$, and for correlation values of $\rho = 0$, 0.25, 0.50, and 1.0. Utilization, $\nu$, is 0.90. The 2000 sample values were chosen from the end of a simulation run of length 30,000 to ensure that any possible effects of start-up conditions were eliminated. After making several pilot runs, m in equation (18) was set equal to 400. For $\rho = 0.50$ and $\rho = 1.0$ in Figure 10 it is obvious that the waiting times at the second server give rise to different spectra than the waiting times at the first server.
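The truncated estimate with a hamming lag window can be sketched in a few lines. This is an illustrative reading of equations (18) and (19), not the authors' program: the function names are hypothetical, and the leading factor 2 is an assumption chosen so that the estimate integrates to the sample variance $c_0$ over $0 \le \omega \le 1/2$, in keeping with (17).

```python
import math


def sample_autocovariances(z, m):
    """Return c_0, ..., c_m with divisor N (not N - k)."""
    n = len(z)
    mean = sum(z) / n
    d = [x - mean for x in z]
    return [sum(d[t + k] * d[t] for t in range(n - k)) / n
            for k in range(m + 1)]


def hamming_lag_window(m):
    """Lag-window weights 0.54 + 0.46 cos(pi k / m), k = 0, ..., m."""
    return [0.54 + 0.46 * math.cos(math.pi * k / m) for k in range(m + 1)]


def blackman_tukey_spectrum(z, m):
    """Smoothed spectral estimates at the frequencies j / (2m), j = 0, ..., m.

    The overall factor 2 is an assumed normalization (see the lead-in).
    """
    c = sample_autocovariances(z, m)
    lam = hamming_lag_window(m)
    freqs = [j / (2.0 * m) for j in range(m + 1)]
    estimates = []
    for w in freqs:
        s = lam[0] * c[0]
        s += 2.0 * sum(lam[k] * c[k] * math.cos(2.0 * math.pi * w * k)
                       for k in range(1, m + 1))
        estimates.append(2.0 * s)
    return freqs, estimates
```

With a series of 2000 waiting times and `m = 400`, the call `blackman_tukey_spectrum(z, 400)` returns 401 estimates, matching the count used later in the text. Since the test developed in this section depends only on ratios of estimates at equal frequencies, the choice of normalizing constant does not affect it.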
Since the integral of the power spectrum measures the variance of the process and the area under the sample spectrum should be indicative of the sample variance, we see that the effect of positive correlation is to reduce the variance of the waiting time process. Again this is consistent with Conolly's results [4] for the single server system in which a customer's service is completely determined by the length of the interarrival interval separating himself and his predecessor. For a utilization of 0.9, Conolly's system reduces the steady state variance of the waiting time from 81, for the classic M/M/1 system, to 1.16; the sample variance associated with the waiting times at server 2 in Figure 10d for $\rho = 1.0$ is 2.05 (the sample variance associated with the waiting times at server 1 is 60.1). Recall from Section 4 that the condition $\rho = 1.0$ for the correlation between a customer's service times at the two servers is equivalent to a correlation of $\nu$, or 0.9 in this case, between his interarrival time and service time at the second server. Therefore, the reduction in variance is consistent with Conolly's results since the corresponding correlation in his system is one.

[Figure 9. Ratio of mean waiting times in system, 2Q, $q = 2$, $\nu = 0.68$. Horizontal axis: correlation $\rho$.]

[Figure 10. A portion of waiting time spectra, 2Q, $\nu = 0.9$, $q = \infty$. Each panel shows the spectra for server 1 and server 2; panel (c) is $\rho = 0.50$. Horizontal axis: frequency $\omega$.]

Next we develop a nonparametric test for the hypothesis that $f^{(1)}(\omega) = f^{(2)}(\omega)$, $0 \le \omega \le 0.5$, where $f^{(i)}(\omega)$ represents the power spectrum at frequency $\omega$ associated with the time series $\{W_n^{(i)}\}$, $n = 1, 2, \ldots, N$; $i = 1, 2$.
The Blackman-Tukey "hamming" lag window in (19) gives rise to spectral estimates which are not independent, and so we employ the notion of equivalent independent estimates [10] which implies, for this window, that estimates are approximately independent if they are about 5/(4m) cycles apart. Since the estimates in (18) are separated by a basic frequency of 1/(2m) cycles, this spacing of 5/(4m) cycles amounts to taking, as independent, those estimates which are separated by an interval of 2.5 times the basic frequency. Since the waiting times are not normally distributed and the assumption of normality is implicit in the development of equivalent independent estimates, we take this spacing of 2.5 times the basic frequency simply to be a rough guide. Actually, the normality assumption is more critical for making distributional assumptions about the spectral estimates than for the usage here. To select a practical spacing and to reduce any possible effects of the normality assumption we take estimates at the frequencies $j/(2m)$, $j = 1, 4, 7, \ldots$, to be approximately independent (the spacing here is 3 times the basic frequency). Therefore, of the 401 estimates in each spectrum partially illustrated in Figure 10, we take 134 estimates at the frequencies $j/800$, $j = 1, 4, 7, \ldots, 399$, to be approximately independent.

Now for each approximately independent estimate we can regard the ratio $\hat{f}^{(1)}(\omega)/\hat{f}^{(2)}(\omega)$ as a Bernoulli trial (greater than one or less than one), and under the null hypothesis of homogeneity of the two spectra we can take as a test statistic the number of ratios which are less than unity. Figure 11 shows the ratio for $\rho = 0$ and $\rho = 0.25$.

[Figure 11. Ratio of sample power spectral estimates, 2Q, $\nu = 0.9$, $q = \infty$. Panels (a) and (b). Horizontal axis: independent estimates.]
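The test statistic just described is the count of ratios below unity among n approximately independent ratios, which under the null hypothesis is binomial with parameters n and 1/2. A minimal sketch of the lower-tail probability follows, giving both the normal approximation and the exact binomial sum for comparison. The function names and the optional continuity correction are assumptions of this sketch; the text does not say whether a correction was applied.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sign_test_lower_tail(k, n, continuity=False):
    """Normal approximation to P(X <= k) for X ~ Binomial(n, 1/2)."""
    mean = n / 2.0
    sd = math.sqrt(n) / 2.0
    x = k + 0.5 if continuity else float(k)
    return normal_cdf((x - mean) / sd)


def exact_lower_tail(k, n):
    """Exact P(X <= k) for X ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, i) for i in range(k + 1)) / 2.0 ** n


if __name__ == "__main__":
    for k in (64, 43, 23, 0):
        print(k, sign_test_lower_tail(k, 134), exact_lower_tail(k, 134))
```

With n = 134, a count near the middle of the range, such as 64, gives a tail probability of roughly three tenths, while counts of 43 or fewer give probabilities well below .001.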
Of the 134 approximately independent ratios in Figure 11a, 64 are less than unity and in Figure 11b, 43 of the 134 ratios are less than unity. Under the null hypothesis, a ratio greater than unity is as likely as a ratio less than unity, and taking a normal approximation to the implied binomial distribution, we have a probability of 0.31 associated with observing 64 or fewer ratios less than unity in 11a and a probability of .001 associated with observing 43 or fewer ratios less than unity in 11b. Although not illustrated, the results for $\rho = 0.50$ and $\rho = 1.0$ are even more conclusive: for $\rho = 0.50$, 23 of the 134 approximately independent estimates are less than unity, and for $\rho = 1.0$, none of the ratios is less than unity. Therefore, for the case presented here we have good statistical evidence that the power spectra associated with the waiting times at each server are not homogeneous for correlation $\rho > 0$. We expect similar results to obtain for other values of correlation, utilization, and interstage storage.

APPENDIX

In this appendix we show how a set of recursive formulae for the waiting times can be constructed for any number of queues in series where interstage storage is unlimited. Referring to the following diagram for $c_n$ and $c_{n+1}$'s queueing and service times at stages two and three (a continuation of the diagram preceding equation (1) in the text), we see that $c_{n+1}$'s interarrival time at stage three is

$$T_{n+1,3} = \begin{cases} S_{n+1,2}, & \text{if } T_{n+1,2} < W_n^{(2)}, \\ T_{n+1,2} - W_n^{(2)} + S_{n+1,2}, & \text{if } T_{n+1,2} \ge W_n^{(2)}. \end{cases}$$

[Diagram: queueing and service times of customers $c_n$ and $c_{n+1}$ at stages two and three.]

Similar to equation (3) in the text, $c_{n+1}$'s waiting time at the third stage is

$$W_{n+1}^{(3)} = \begin{cases} S_{n+1,3}, & \text{if } T_{n+1,3} > W_n^{(3)}, \\ W_n^{(3)} - T_{n+1,3} + S_{n+1,3}, & \text{if } T_{n+1,3} \le W_n^{(3)}. \end{cases}$$

Comparing $T_{n+1,2}$ and $T_{n+1,3}$ we have, in general, for $c_{n+1}$'s interarrival time at stage i, $i = 2, 3, \ldots$,

$$T_{n+1,i} = \begin{cases} S_{n+1,i-1}, & \text{if } T_{n+1,i-1} < W_n^{(i-1)}, \\ T_{n+1,i-1} - W_n^{(i-1)} + S_{n+1,i-1}, & \text{if } T_{n+1,i-1} \ge W_n^{(i-1)}. \end{cases}$$
Similarly, comparing $W_{n+1}^{(1)}$, $W_{n+1}^{(2)}$ and $W_{n+1}^{(3)}$ gives a general recursive formula for $c_{n+1}$'s waiting time at stage i, $i = 1, 2, \ldots$,

$$W_{n+1}^{(i)} = \begin{cases} S_{n+1,i}, & \text{if } T_{n+1,i} > W_n^{(i)}, \\ W_n^{(i)} - T_{n+1,i} + S_{n+1,i}, & \text{if } T_{n+1,i} \le W_n^{(i)}. \end{cases}$$

Thus, we can obtain the recursive formulae for any number of queues in series.

REFERENCES

[1] Anderson, T. W., The Statistical Analysis of Time Series (John Wiley & Sons, Inc., New York, 1971).
[2] Bhat, U. N., "Queueing Systems with First-Order Dependence," Opsearch, 6, 1-24 (1969).
[3] Burke, P. J., "The Output of a Queueing System," Operations Research, 4, 699-704 (1956).
[4] Conolly, B. W., "The Waiting Time Process for a Certain Correlated Queue," Operations Research, 16, 1006-1015 (1968).
[5] Conolly, B. W. and N. Hadidi, "A Correlated Queue," Journal of Applied Probability, 6, 122-136 (1969).
[6] Conway, R. W., "Some Tactical Problems in Digital Simulation," Management Science, 10, 47-61 (1963).
[7] Downton, F., "Bivariate Exponential Distributions in Reliability Theory," J. Royal Statist. Society, Series B, 33, 408-417 (1970).
[8] Fishman, G. S. and P. J. Kiviat, "The Analysis of Simulation-Generated Time Series," Management Science, 13, 525-557 (1967).
[9] Jackson, R. R. P., "Queueing Systems with Phase Type Service," Operations Research, 5, 109-120 (1954).
[10] Jenkins, G. M., "General Considerations in the Analysis of Spectra," Technometrics, 3, 133-166 (1961).
[11] Kesten, H., "Random Difference Equations and Renewal Theory for Products of Random Matrices," Acta Mathematica, 131, 207-248 (1973).
[12] Kibble, W. F., "A Two-Variate Gamma Type Distribution," Sankhya, 5, 137-150 (1941).
[13] Kohberger, R. C., "On Certain Multivariate Exponential Distributions," Ph.D. Thesis, Rensselaer Polytechnic Institute, Troy, New York (1975).
[14] Krishnamoorthy, A. S. and M. Parthasarathy, "A Multivariate Gamma Type Distribution," Ann. Math. Statist., 22, 549-557 (1951).
[15] Morse, P. M., Queues, Inventories and Maintenance (John Wiley & Sons, Inc., New York, 1958).
[16] Paulson, A. S., "A Characterization of the Exponential Distribution and a Bivariate Exponential Distribution," Sankhya, Series A, 35, 69-78 (1973).
[17] Saaty, T. L., Elements of Queueing Theory with Applications (McGraw-Hill, New York, 1961).
[18] Wicksell, S. D., "On Correlation Functions of Type III," Biometrika, 25, 121-133 (1933).

SINGLE-LANE BRIDGE SERVING TWO-LANE TRAFFIC

Z. Eshcoli and I. Adiri
Technion - Israel Institute of Technology
Haifa, Israel

ABSTRACT

This paper presents a mathematical model of a single-lane bridge serving two-way traffic in alternating directions (with a FIFO rule observed within each directional queue). While the bridge serves cars moving in one direction, cars approaching from the opposite direction wait in a queue at its foot. When cars in the current direction finish crossing the bridge, it begins serving cars from the other direction, if any are present. A newly-arrived car finding an empty bridge mounts it immediately. Several cars moving in the same direction may occupy the bridge simultaneously. The crossing speed is assumed to be constant, and the arrival processes in both directions are assumed to be independent, homogeneous Poisson processes. A generalization of the alternating-priority models [1, 2] is developed to arrive at the Laplace-Stieltjes transform and the expected value of the flow time (the time interval between the moments of arrival at the bridge and departure from it) for steady state conditions. The results are discussed and some examples are presented graphically.

1. INTRODUCTION

Cars arrive at a single-lane bridge in two opposite directions, numbered 1 and 2, according to independent homogeneous Poisson processes with arrival rates $\lambda_1$ and $\lambda_2$, respectively.
A car going in direction i (i = 1, 2), hereinafter referred to as "type-i car," may cross the bridge if the latter is empty or carrying cars of the same type; otherwise (i.e., if cars of the other type are crossing) it must wait in a queue at the foot of the bridge. The time interval between the moment when the bridge starts serving type-i cars and the moment when it carries none of them is called "type-i phase." At the end of type-i phase (i = 1, 2), either the bridge is empty or type-j cars ($j \neq i$) have queued up at its foot. This queue, called the initial queue, now mounts the bridge in order of arrival. The crossing speed is assumed constant, hence the crossing time is also constant. This assumption is not unrealistic since usually the speed limit on such a narrow bridge is far below the free speed of modern cars. When there is no queue of type-i cars but the bridge still carries them, a newly-arrived type-i car does not have to wait, thus spending only the crossing time in the system.

The crucial difference between the model described above and an alternating-priorities queueing model ([1, 2, Chapter 9]) is that here the service facility (the bridge) can accommodate several customers (cars) simultaneously; accordingly, the term "service" will need a special definition, given later on.

Our aim is to find the steady-state distribution and the expectation of the flow time (the time interval between the moments of arrival at the bridge and departure from it) of a car in the system as a function of the bridge's length (or equivalently the time spent crossing the bridge). Although our analysis is based on [1] and [2], these models may be derived as special cases of the model discussed in this paper.

The situation described above is not limited to the case of a narrow bridge, but is applicable to any one-lane road serving two-directional traffic, a common situation when repairing a road.
Assume that a two-lane road of given length has to be repaired. The repair will be done by choosing one lane for traffic alternately. The maintenance chief would like to do the work in one stretch to save set-up costs; on the other hand, this policy creates long queues and in turn high flow times. Having the necessary cost data, the results obtained may serve as a guideline in determining the optimal partition for the repair of the road.

At the end a discussion of the model is given, and the expected steady-state flow time as a function of the length of the bridge, for specific parameters, is presented graphically.

Related models were studied by several authors. Darroch, Newell and Morris [3] considered a model in which a vehicle-actuated traffic light controls two intersecting traffic streams. The light is kept green for lane i (i = 1, 2) until any existing queue of type-i cars has been discharged (the "discharging time"), and further until a headway of duration at least $\beta_i$ is detected in the subsequent arrivals (the "extension time"). The main difference between [3] and our model is that in [3] the discharging time and the extension time are independent random variables. Thus the light may change to green for lane i even if there are no cars present in this lane, and this light stays green only for the extension time, during which cars arriving from the intersecting lane have to wait even though no type-i cars are present. This case cannot happen in our model, where a busy bridge is available for type-i cars iff type-i cars are crossing it.

In another paper, Hawkes [4] assumed generally distributed crossing times and alternating priorities discipline. The expected waiting time of a type-i car was calculated and this result, obviously, coincides with the result in [1] and subsequently may be obtained as a special case of our model.
Tanner [5] discussed a similar model in which the crossing times were also constants but the queue discipline was different: a type-i car could cross the bridge if there were no type-j cars ($j \neq i$) on the bridge and the last type-i car had started crossing at least $\beta_i$ time units ago. An explicit formula for the expected waiting time of a type-i car was presented only when $\beta_1 = 0$ or $\beta_2 = 0$.

2. MATHEMATICAL MODEL

2.1. Basic Relations

Let $Y_i$ (i = 1, 2) be the mounting time* of a type-i car, with the exception of a type-i car initiating a type-i phase, which has $Y_i^0$ as its mounting time, and let $S_i$ be the constant crossing time of a type-i car. The mounting times are assumed to be non-negative arbitrarily-distributed random variables independent of each other and of the interarrival times, and possessing finite second moments. The crossing times are assumed to be finite positive constants. Let the arrival process of cars in direction i (i = 1, 2) be a homogeneous Poisson process with an average of $\lambda_i$ cars per time unit.

*Defined as the time interval between the moments two successive cars (present in the system) begin to cross the bridge.

We denote:

(1)  $\lambda = \lambda_1 + \lambda_2,$

and

(2)  $\rho_i = \lambda_i E(Y_i), \quad i = 1, 2;$

then under the above assumptions the system is non-saturated if:

(3)  $\rho = \rho_1 + \rho_2 < 1,$

see Section 4.

At steady state the system undergoes cycles of length $T_c$, the components of each cycle being an idle period $T_a$ (the time period during which the bridge is empty), and a busy period $T_b$ (the time interval between two successive idle periods). Thus:

(4)  $T_c = T_a + T_b.$

Two types of busy periods are observed: $T_{b1}$ and $T_{b2}$, where $T_{bj}$ (j = 1, 2) starts with the arrival of a type-j car to an empty bridge and terminates when the system is empty. Hence:

(5)  $E(T_c) = E(T_a) + E(T_b) = \frac{1}{\lambda} + \frac{\lambda_1}{\lambda} E(T_{b1}) + \frac{\lambda_2}{\lambda} E(T_{b2}).$

Following the notation in [2], $T_{bj}$ may be subdivided into subcycles of successive flows in alternating directions denoted by $T_{kj}$:

(6)  $T_{bj} = \sum_{k=1}^{\infty} T_{kj}, \quad j = 1, 2.$
Every subcycle $T_{kj}$ (except when k = j = 1) comprises two successive phases of flow and counterflow as follows:

(7)  $T_{11} = T_{111}; \qquad T_{k1} = T_{2,k-1,1} + T_{1k1}, \quad k > 1,$

where $T_{ikj}$ is the k-th phase of flow of type-i cars in a type-j busy period. Thus, the two types of busy periods may be described as follows:

Type-1
Phases: $T_{111} \mid T_{211} \mid T_{121} \mid T_{221} \mid T_{131} \mid \cdots \mid T_{2,k-1,1} \mid T_{1k1} \mid \cdots$
Subcycles: $T_{11}$ (first phase) $\mid T_{21}$ (next two phases) $\mid T_{31} \mid \cdots \mid T_{k1} \mid \cdots$

Type-2
Phases: $T_{212} \mid T_{112} \mid T_{222} \mid T_{122} \mid \cdots \mid T_{2k2} \mid T_{1k2} \mid \cdots$
Subcycles: $T_{12}$ (first two phases) $\mid T_{22} \mid \cdots \mid T_{k2} \mid \cdots$

Each phase $T_{ikj}$ may in turn be divided into two subphases:

(8)  $T_{ikj} = T_{ikj}^{(1)} + T_{ikj}^{(2)}, \quad i, j = 1, 2; \; k = 1, 2, \ldots,$

the first of which, $T_{ikj}^{(1)}$, begins when the type-i car initiating the phase mounts the bridge and terminates when there is no queue in direction i (there are type-i cars on the bridge); the second, $T_{ikj}^{(2)}$, immediately follows the first and terminates when there are no more type-i cars in the system (in queue and crossing the bridge). In the first subphase the "service" consists in mounting the bridge, so that the service times for type-i cars are independent r.v. distributed as $Y_i$, except for the initiator which has $Y_i^0$ as its service time. In the second subphase newly-arrived cars mount the bridge without waiting so long as there are cars of the same type on the bridge. The "service" thus consists in crossing the bridge, so that type-i cars have a constant service time denoted by $S_i$. Note that $S_i$ (i = 1, 2) is determined by the length of the bridge and the crossing speed limit, the latter being lower, by assumption, than the free speed of type-i cars.

Let $W_i$ (i = 1, 2) be the flow time of a type-i car (i.e., the time from arrival until departure), and W the flow time of an arbitrary car, then:

(9*)  $L_W(z) = \frac{\lambda_1}{\lambda} L_{W_1}(z) + \frac{\lambda_2}{\lambda} L_{W_2}(z).$

Due to symmetry, $L_{W_2}(z)$ may be derived from $L_{W_1}(z)$ by changing indices; thus, without loss of generality, only type-1 cars need be considered. A type-1 car arrives at either an empty or a busy bridge.
In the first case its flow time is Si ; In the second, let U^j be its flow time if it arrives in T^j{j=l, 2; k = l, 2, . . .,). Hence, following [1], we have : h.tL{-lc) [ k = l k = l I In the next section the L.S. transform and the expected value of a phase are found which yield (in view of (7)) E{T^,) and E{T^2)- Following in 2.3 E{Tc) is computed and finally in 2.4 we turn to find Lu.iiz). Combining these results, using (10), yields the L.S. transform of the flow time of a type-1 car. *For a non-negative r.v. X having c.d.f. Fx(-), Lx{z)=j\-"dFx{x) denotes its Laplace-Stieltjes (L.S.) transform. SINGLE-LANE BRIDGE ' 117 2.2. Length of Phase Consider T^i, k^l. Clearly ri|'i=0 whenever T^]^,=0 so that for Ar>l we have: where Ti\\ and T/^'i are defined and discussed by equation (8) and its sequel. If the first component, Ti"i, is positive it can be treated as a busy period in an M/G/1 model where the service time is Yi, and the service time of the "first customer" being the time required by the initial queue of type-1 cars, formed during T2.k-i.\ to mount the bridge. Therefore. (12) Ti\\=±:t„ i = l where ti is the mounting time required by the initial queue and tt (t>l) is the time required by the cars arrived during ^_i to mount the bridge. By well-known results ([2, p. 151]) : (13) Z^m (2)=:i,,(2 + X.-XiLi(2)), k>l where (14) Zi(2)=Zy,(2 + Xi-XiZ,(2)), Let N2,k-\.i be the number of type-1 cars arrived during Ta.^-n then: iLY^{z)[LYXz)r-\n>0 (15) Eie-''^\T2.k-i.x=t,N2.k-i.,=n): .^ ^. [1 J TV — \j Withdrawing the conditions in (15) we have: (16) ij2)=in.-,,,(Xi)+^r^ (ZT...,..(Xi-X.iy,(2))-iT..»-,.,(X,)), ^>1. Substituting (16) in (13) we obtain: (17) Z^(n(3)=ZT.._,.XXi)+:^^^^^^^^^t^g^^ k>\ where 2.1(2) is given by (14). As for the second subphase in T^^i '• let and let M^'i be the number of type-1 cars arrived during Ti^'i. Denote by t„i, m>l, the time elapsed between the moments of arrival of the (m-l)-th and the m-th car in Z","']. t^x are independ- ent r. v. 
distributed as ti. ti<S'i because otherwise Zjl'i would terminate. Hence the densit}' func- tion of Ti takes the form: (18) " /n(«)=l^^/ Q<t<S. 118 Z. ESHCOLI AND I. ADIRI A type-1 car arriving at the system during Til'i could cross the bridge iff the preceding type-1 car were still on it — an event which occurs with probability 1 — e~^'^', hence: (19) P(M,Tl=w)-=(l-e-^'^')"e"^'^ n>0; k>l. Now for k'^l, (20) 7'(2) _J MiTi m = l Su MfA=0. Expressing (20) in terms of L.S. transform we arrive at: Xl + 2 ^^iTi(^)= X:+36^^-+-)^- ' ^>^- Af' ^' -1 ■'" in ^ -?^iii = -riii = Oi+ ^ Tml, m=l (21) When /: = ! it is true that: (22) but (23) P(M,Ti=n) = (l-e-^'^")"-ie-^>^', n=l, 2, . . .,. (It is clear from the definition that at least one type-1 car appears during Tm.) Therefore: Xl + 2 (24) iT,„(2) = Xi+2e(x.+z)s, When Ar>l, Tui=0 iff in the preceding phase, T2,ic-i.i, there were no arrivals of type-1 cars. The probability of this event is : (25) (T, -'^L e-^"c^Fr,,.,,,(0=in»-...(Xi). k>l By (11), (25) and the independence of Ti'^i\Ti\\>0 and T.Ti: (26) S(e-^.'0=Zr..-,.XX,) + (l-LT..-,.XXi))iT<?,(2)£:(e-^^iTi|r.'i\>O).Ar>]^ But (27) i?(e-^iTi|Tffi>0) = J^" ^"^'«'^T<»,|^<i\>o(0 = J^° c-'(i 1— P(Tn,=0)J Substituting (27) in (26) yields: (28) iru,(2)=i^n.-,,,(Xi)+iTf,',(2)(ZT<i'^(0)-LT.,....,(Xi)),^>l. From (17), (21) and (28) we finally obtain: (29) i'T„i(2)=I'r2.»-,,,(Xi)- Xi + 2 Zyo^S + Xi — XiZ,(2)) Xj + 2e(X.+^)S, i:(3) (-2'n.-,,.(Xi-XiL,(2))-ZT...-..,(Xi)),A:>l, SINGLE-LANE BRIDGE 119 where Zi (2)= Ly, (2 +^1—^1^1 (2))- Similarly, by symmetry, we have: (30) Lt,.M)=Lt,.M+ ^^_^'J;,1.,s^'^'^'^^^^^^ k>l, where i2(2)=iK,(2+X2 — XzZzCs)). Theoretically, LritM) is obtainable recursively from the above relations: Lrmiz) is given by (24); substituting Lrmiz) in (30) for k=l yields Lrmiz)', substituting Lrjui^) in (29) for k=2 yields Lt,,,(z), etc. 
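The second subphase treated in equations (18) to (24) lends itself to a direct Monte Carlo check. The sketch below is illustrative only and rests on one stated assumption: the transform in (21) and (24) is read as $(\lambda_1 + z)/(\lambda_1 + z e^{(\lambda_1 + z) S_1})$, whose derivative at $z = 0$ gives a mean length of $(e^{\lambda_1 S_1} - 1)/\lambda_1$. The function names are hypothetical. A subphase is sampled as the crossing time of the last car plus every interarrival gap shorter than the crossing time, following the description around (20).

```python
import math
import random


def second_subphase_length(lam, s, rng):
    """One sample of the second subphase length.

    Each new car extends the subphase only if it arrives while its
    predecessor is still crossing, that is, if the gap is shorter than s.
    The subphase ends one crossing time after the last such arrival.
    """
    total = 0.0
    while True:
        gap = rng.expovariate(lam)
        if gap >= s:
            return total + s
        total += gap


def mean_second_subphase(lam, s):
    """Mean length (exp(lam * s) - 1) / lam under the assumed transform."""
    return (math.exp(lam * s) - 1.0) / lam


def estimate_mean(lam, s, n_samples=200000, seed=7):
    """Monte Carlo estimate of the mean second subphase length."""
    rng = random.Random(seed)
    return sum(second_subphase_length(lam, s, rng)
               for _ in range(n_samples)) / n_samples
```

Comparing `estimate_mean(0.5, 2.0)` with `mean_second_subphase(0.5, 2.0)` gives a quick check that the sampled mechanism and the assumed transform agree in the mean; agreement in the mean alone does not verify the full transform.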
Lrn^iz) are obtained by change of indices in Lr.tii^)- Differentiating (24), (29) and (30) with respect to 2 at 2=0, yields the expected value of the length of a phase : (31) E{Tni)- X, 1— Pi \ i— Pi Xi / (33) E{T,,,)^j^- g(rui)+( ^^-r°^~-^^^ +^^^) (l-Zru,(X.)), A:>1. 1— P2 \ 1 — P2 X2 / Solving the set (31), (32), and (33), we have: (34) E{T,,,)=E{Tu.)r'-'+( '^^\ ' '^ +^— ^) S ^^-V"" (l--^r..,(XO) \ 1— Pi A] / m=l 1 — Pi \ I — P2 A2 / m=l and (35) E{T^,}=-^ EiTndr'-'+j^^ / E{T\-YO _^e^^\ g ^_i_„ (i_2:^^__(x0) i — P2 i — P2 \ 1 — Pi Ai / m = l \ 1 — P2 ^2 / m = l where P1P2 (1-p,) (I-P2) and £'(r„,) is given by (31). Substituting (31), (34) and (35) in (7) yields the expected length of the subcycles E(Tt\), k>l. Thus, to find the Laplace-Stieltjes transform of the flow time of a type-i car, it is left to find EiTc) and Lv^^iz), ?:=1, 2; k=l, 2, . . . ., In the next section we calculate E(Tc). 2.3. Expected Length of Cycle Let Nij (i, j=l, 2) be the number of type-t phases in type-j' busy period. The distributions and expectations of these r.v. were found in [2, p. 199]: (36) P(Nu=l)=P{Tr,,=0)= j\-^^^dFT,At)=LT,A^x), and 120 Z- ESHCOLI AND I. ADIRI and (37) P {Nn=k) =P {Tui>0, n ,+,. 1=0) = P (T,, ,+,. ,=0)-P(Tux=0) =Lr„, (X,) -ir. »-,. , (M , ^>1. Hence: (38) EiNn) = l + t: (1-Ztu>(Xi)) k=l and similarly: (39) ^(A^2.) = f:(l-XTu,(A2)). fc=l The expected values of N12 and A7^22 are obtained from (38) and (39) by change of indices. From (31), (32), (33), (38) and (39), we have: (40) g E(T„.)=f^|4^+(^(^^+4^).£(iV,.-l) 1— Pi \ I—P2 A2 / J Hence : (42) Ein,)^-=± [E{T,,,)+E{T,,,)]=^ U E{Y\-Y,) _^e^^^\ ^^^^^^ k=\ 1 — P [\ 1 — Pi A] / +E(Y'.-Y.m^^..) (^<M)+^) E(N„) j. E{Tb^ is obtained from (42) by change of indices. Finally from (42) and (5) : (43) E{T,) = \+^j^^ {Xi(l-p,) |^ g(F°i-F.) ^g^^l^ E{Nn)+\,E{Y,-Y°,) +x.a-PO (^^p^+q=:^) ^'(A^.)+Xi(i-p.) 
(^ifa^+^^^ \ l — Pl Xi / \ 1— P2 X2 / •£:(iV2i)+X2(l-P2) ( ^^"^X^'^ ^^^^) E{N,,)+\,E{Y2-Y°2)\ where EiNij), i, j=l, 2, are given by equations (38) and (39). Substituting E(T,) and the previous results in L^iiz) (equation (10)), it still remains to find the L.S. transform of the flow time of a type-1 car in a subcycle. This is done in the next section. SESTGLE-LANE BRIDGE 121 2.4. Flow Time Consider The flow time of a type-1 car in Tfl^ is -Si. Let U'ki A:>1 be the flow time of a type-1 car in (44) r=r2.*-i.i+r,Ti Tmay be treated as a delayed busy period in an M/G/1 model where T2,k-i.i is the delay interval and TIW is treated as in (12). Using known results for the flow time in an M/G/1 model where the busy period is delayed and the first customer has a different service time ([2, p. 153]): (45) Lv'.Xz)= £(n,_,:+r,T0(Xiir.(2)-x.+2) ''^^■ By (11) and (25) we have: (46) ZuM(3) = ^^^^'^-'y^^^''"'^ Lt7'»(3) + (l-i:r.>-,..(X.)) ^^e-'"> k>l. Substituting (45) in (46) yields: e-S'^ fiy,(2)(l-^T..-,,.(Xi))-i:y»,(2)(LT,...,.,(2)-iT,,.-,..(X,)) (47) Zu.(3) = -^(y^^) ^ X^LyXB)-X^ + Z +^^(l-^r..-,.,(Xi))[,t>l and similarly: n,{lk2) [ \iivy,(2) — Xi + 2 \\ ] By definition: (49) Z[;„(2)=e-^'^ Substituting (47), (48) and (49) in (10), we fuially obtain the Laplace transform of the flow time for a type-1 car: (50) iH',(2)=r^^ |1+'-^^ {\E{,Nn)+\2E{N,,)) + ^-. , ] , , ^ ±;[^F.(^)(Xi(l-i:ru.(Xi)) A£L{1 c) I Ai Alivy,(,2)— Ali-2 t = i +X2(l-ZT,.,(X0))-iy°,(2)(Xi(LT,„(2)-LT.„(Xi))+X2(Zr.».(2)-i:r,.,(X,)))]} Differentiating (50) with respect to z at 2=0 we obtain the expected flow time of a typc-1 car: (51) E(W,)=S,+^^^^j~^^^-—^-, {(1-p,) ±[X,E{TU+\,E{Tl,,)]+{\,E{Yr') +2(l-p,)E{Y°,))±,[X,E{T2Hi)+X2E{T,,,)] + {\,E{Y,')E{Y°,-Y,) + {l-pi)EiY''rr-Y^)){\,E{Nu-l)+\2E{Nn))] 122 Z. ESHCOLI AND I. 
ADIRI Now »=i is given by (41) ; k=l is obtained from (40) by change of indices; differentiating (30) twice with respect to z at 2=0, we have E{Tl)^i); differentiating (29) twice with respect to z at 2 = and changing indices, we have E{Tl,c2)- These results are to be substituted in (51) to yield the expected value of the flow time of a type-1 car. 3, SPECIAL CASES 3.1 FYoAy)^FYXy), 1=^1,2. We assumed that the first type-i car (i=l, 2) in each subcycle has a different mounting time. This is usually the case in real life. However, assuming that the mounting times of all type-i cars are identically distributed simplifies the equations and may provide adequate approximations. Hence replacing Y'°i by Yt (i=l, 2) in (40) and (41) and changing indices in (40), we have: (52) i:i?(r,.,)= »-'';'"-'"> 1^ 5^ E(N„)+'^ £w,) fc=l 1 — P li — P2 Ai A2 (53) i:£(r,.).= <'-7>"-'"M '^t^£(A'.)+,-''^-^g(A-,.) )c = l 1 — P I A2 1 — P2 Al Differentiating (24), (29) and (30) twice with respect to z at 2=0 we obtain difference equations for the second moments of the phase length. Summing these equations we obtain two independent equations for k=l and k = l whose solution is : (54) |:^(r..)^^,{(^j^^.^)S.(n.,)+ ---n7"^''-' ^''^'--'l (55) ^-rb(iJ-JI(^,'gg^a-.^)S^<^->^ ""^''":7'"^'^-' ^'^-">l- SINGLE-LANE BRIDGE 123 Replacing F°< by Y( (t=l, 2) in (51), we obtain: + {\,E(Y,')+2{l-pr)E{Y0) S i^.E{T,,,)+\2E{T2,2))\ /i = l Now is given by (55) ; k=l JlEiTh,) k = l is obtained from (54) by change of indices ; ±E{Tu.) k = l and S E{ T2k2) m = l are given in (52) and (53). Substituting these values in (56), we have the expected flow time of a type-1 car in terms of the known parameters of the system. £^(^2) is obtained from (56) by change of indices, and E{'W) is obtained from (9). 3.2. Alternating Priorities If the crossing times are negligible then the problem of simultaneous service disappears and the only service that cars receive at the bridge is mounting it. 
Hence substituting <S'i=0, i=l, 2, in the above discussion we obtain the known formulas for the L.S. transform and the expected value of the flow time under alternating-priorities rule ([1, 2]). Furthermore, the case of alternating- priorities with set-up times may also be derived if the first service in each phase, Y°i, i=l, 2, is decomposed into a sum of the (independent) set-up time and the "ordinary" service time. 4. DISCUSSION 4.1. Non- saturation Conditions The distribution of Tj'^'i ^>1 is independent of k (equation (21)). In the same manner it can be proved that T'^^,, Ar>l, is independent of k and j. Define: (57) F°i=F°i+ r«> , F°2= r°2+ r«> With this definition, ignoring the first and last phases in each busy period, our model becomes . identical, as far as saturation is concerned, to an alternating priorities model in which the service time for the i-th priority class (i=l, 2) is distributed as F,, except for the first customer in a phase whose service time is distributed- as Y°i. (Obviously, the first and last phases in each busy period and the different service time of the first customer do not aflfect saturation.) Hence the non-satura- 124 Z. ESHCOLI AND I. ADIRI tion condition for our model is the same as that for the above described alternating priorities model ([1, 2, Chapter 9]), namely the condition stated in (3). 4.2. Mounting Times When T^j ^>1 starts the queue of type-i cars formed at the foot of the bridge during the previous phase mounts the bridge. The mounting time of a car was defined as the time elapsed between the moments when two consecutive cars present in the system begin to cross the bridge. Clearly, the mounting time is the time needed to pass a distance equal to the length of a car and a minimal safety distance between two consecutive cars. (In the subcycles Tlfji, j=l, 2, k>l the minimal safety distance is not necessarily kept.) 
The mounting time of the first car in a phase does not share the same distribution because the first car has some additional preparations to make before mounting the bridge, and here no overlapping activities are possible.

4.3. Graphical Representation

It is assumed, for simplicity, that the two priority classes differ only in their arrival rates, i.e., they have the same crossing time $S$, and their mounting time is distributed as $Y$, including the first car in a phase. Furthermore, let us denote:

$A$ = bridge's length
$\phi$ = crossing velocity (constant)
$c$ = car's length (assumed to be uniformly distributed on the interval (3, 5))
$D$ = safety distance
$\psi$ = mounting velocity (constant)

Distances are measured in meters and the unit of time is a minute. The crossing velocity was assumed to be constant; as a direct consequence, the crossing time $S = A/\phi$ is constant too. Assume further that $\psi$ is constant; then, in view of 4.2, $Y$ must be proportional to $c + D$, and in fact equality was assumed.

Figure 1 shows the behaviour of the expected flow times $E(W_1)$ and $E(W_2)$ as a function of the bridge's length $A$, with $\lambda_1 = 5$ cars/min., $\lambda_2 = 10$ cars/min., $D = 3$ meters, $\psi = 150$ meters/min., $\phi = 500$ meters/min., $E(Y) = 0.047$ min. (that is, $(E(c) + D)/\psi = 7/150$), $\rho = 0.7$. Since $\lambda_1 < \lambda_2$, it takes more time to clear the bridge of type-2 cars, thus accounting for the greater expected flow time of type-1 cars.

Figure 1. Expected flow time as a function of the bridge's length (horizontal axis: bridge's length in meters).

REFERENCES

[1] Avi-Itzhak, B., W. L. Maxwell and L. W. Miller, "Queueing with Alternating Priorities," Operations Research, 13, 306-318 (1965).
[2] Conway, R. W., W. L. Maxwell and L. W. Miller, Theory of Scheduling (Addison-Wesley, 1967).
[3] Darroch, J. N., G. F. Newell and R. W. J. Morris, "Queueing for a Vehicle-Actuated Traffic Light," Operations Research, 12, 882-895 (1964).
[4] Hawkes, A.
G., "Queueing at Traffic Intersections," Proceedings, Second Symposium on the Theory of Traffic Flow, London (1963).
[5] Tanner, J. C., "A Problem of Interference Between Two Queues," Biometrika, 40, 58-69 (1953).

OPTIMAL CONTROL FOR MULTI-SERVERS QUEUEING SYSTEMS UNDER PERIODIC REVIEW*

C. C. Huang,† S. L. Brumelle and K. Sawaki
University of British Columbia
Vancouver, B.C., Canada

I. Vertinsky
International Institute of Management
Berlin, Germany

ABSTRACT

This paper deals with the problem of finding the optimal dynamic operating policy for an M/M/S queue. The system is observed periodically, and at the beginning of each period the system controller selects the number of service units to be kept open during that period. The optimality criterion used is the total discounted cost over a finite horizon.

INTRODUCTION

Most of the related studies reported recently in the literature focused upon controls of one-server queues over an infinite horizon (Heyman [3], Bell [2], Sobel [5], Balachandran [1]). Zacks and Yadin [6] dealt with the case of a finite horizon for an M/M/1 system with variable service intensity under nonperiodic review. In their model decision epochs occur immediately after changes in queue size. They identify an "optimal" policy by resorting to a conjecture and imposing restrictive assumptions. Magazine [4] studied M/M/S queues with finite waiting capacity under periodic review and convex increasing holding costs over finite horizons. All the results in Magazine [4] are derived on the basis of (1) a misspecified holding cost function which ignores the fact that customers may be turned away when waiting room capacity is saturated, and (2) an incorrect argument that the distribution of the number served by the $i$th server in an $m$-server system is identical to the distribution in an $n$-server system where $n \ne m$.

This paper considers a model similar to the one outlined by Magazine, with two major modifications:
(1) Waiting room capacity is taken to be infinite.
(2) The cost structure is generalized to permit different holding costs in different periods.

*The research reported in this paper was supported in part by the Defence Research Board of Canada and the International Institute of Management, Berlin.
†Currently affiliated with Memorial University of Newfoundland, St. John's, Newfoundland, Canada.

MODEL FORMULATION

System Structure

(i) Assume there are $s$ servers in parallel.
(ii) Assume a Poisson arrival stream.
(iii) Assume that the decision points are at equally spaced time intervals. Without loss of generality, we assume that these points are 0, 1, 2, ....
(iv) Assume that the service times have independent and identical exponential distributions and are independent of the arrival process.

Cost Structure

Let $S(l|k)$ denote the cost of changing from $k$ to $l$ open servers at a decision epoch, and then operating the $l$ servers until the next decision epoch. We assume that $S(l|k)$ has the following properties:
(i) $S(l|k)$ is convex in $l$ for each $k$;
(ii) For fixed $j$ and $l$ with $l < j$, the function $S(j|k) - S(l|k)$ is nonincreasing in $k$, and equal to $S(j|k-1) - S(l|k-1)$ for $k \le l$ or $k-1 \ge j$.

For example, we might take

$$S(l|k) = (l-k)^+ A + (k-l)^+ B + lC$$

where $A$ is the cost of opening a closed server, $B$ is the cost of closing an open server, $C$ is the cost of operating an open server for one period, and $k^+ = \max(k, 0)$. It is easy to check that $S(l|k)$ of this form satisfies (i) and (ii).

In addition to the above switching and operating costs, we include a customer holding cost. Let $K_n(i)$ denote the cost incurred during period $n$ if it ends with $i$ customers in the system.

Markov Decision Structure

Define $\phi_n(i, \alpha, k)$ as the minimum expected cost with $i$ customers present, $k$ servers open and $n$ periods remaining in the horizon, using discount factor $\alpha$, $0 < \alpha < 1$. We take the length of time between decision epochs to be the unit of time.
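The example switching cost from the Cost Structure subsection can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, the costs are taken as integers to avoid rounding issues, and the equality condition in property (ii) is read as $k \le l$ or $k - 1 \ge j$, which is an assumption about the damaged source text.

```python
def switching_cost(l, k, A, B, C):
    """S(l|k) = (l-k)^+ A + (k-l)^+ B + l C for the example cost structure."""
    return max(l - k, 0) * A + max(k - l, 0) * B + l * C


def satisfies_cost_properties(s, A, B, C):
    """Check properties (i) and (ii) on the grid 0..s (illustrative helper)."""
    S = lambda l, k: switching_cost(l, k, A, B, C)
    # (i) convexity in l for each k: second differences are non-negative.
    for k in range(s + 1):
        for l in range(1, s):
            if S(l + 1, k) - 2 * S(l, k) + S(l - 1, k) < 0:
                return False
    # (ii) for l < j, S(j|k) - S(l|k) is nonincreasing in k, and is unchanged
    # from k-1 to k whenever k <= l or k-1 >= j (assumed reading).
    for l in range(s + 1):
        for j in range(l + 1, s + 1):
            diff = [S(j, k) - S(l, k) for k in range(s + 1)]
            for k in range(1, s + 1):
                if diff[k] > diff[k - 1]:
                    return False
                if (k <= l or k - 1 >= j) and diff[k] != diff[k - 1]:
                    return False
    return True
```

For instance, with $A = 3$, $B = 2$, $C = 1$, opening two servers from $k = 1$ to $l = 3$ costs $2A + 3C = 9$, and closing two servers from $k = 3$ to $l = 1$ costs $2B + C = 5$.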
The recursive relationships are

$$\phi_n(i, \alpha, k) = \min_{0 \le l \le s} \psi_n(l|i, \alpha, k), \qquad n = 1, 2, \ldots, N,$$

where

$$\psi_n(l|i, \alpha, k) = S(l|k) + \sum_{j=0}^{\infty} P_{ij}^{l}\,[K_n(j) + \alpha \phi_{n-1}(j, \alpha, l)], \qquad \phi_0(i, \alpha, k) = 0,$$

$Z_n$ is the number of customers in the system at the beginning of period $n$, and $P_{ij}^{l} = P[Z_n = j \mid Z_{n+1} = i \cap l \text{ servers are open}]$.

Policy Structure

For each period index $n$ and $k = 0, 1, 2, \ldots, s$ define

$$I_n(l|k) = \{i : \phi_n(i, \alpha, k) = \psi_n(l|i, \alpha, k)\}, \qquad l = 0, 1, 2, \ldots, s.$$

Define a policy $\pi$ by $\pi(n, i, k) = l$ if $i \in I_n(l|k)$, with the interpretation: if there are $k$ servers open and $i$ customers present at the $n$th decision epoch and $i \in I_n(l|k)$, then change the number of servers open to $l$. The optimality of the above policy is obvious, since for any $n$, $k$ and $i$, one can evaluate $\psi_n(l|i, \alpha, k)$ for $l = 0, 1, \ldots, s$ and assign the $i$ to the appropriate set $I_n(l|k)$, which by definition will yield the minimum expected cost.

It is reasonable to expect that for fixed $n$ and $k$, the $I_n(l|k)$ are disjoint intervals. In this case, the policy can be specified by a nondecreasing sequence of control limits

$$i_0^*(k, n) \le i_1^*(k, n) \le \cdots \le i_s^*(k, n) \le i_{s+1}^*(k, n)$$

with the interpretation that if there are $k$ servers open with $i$ customers present at a decision epoch with $n$ periods remaining and $i_l^*(k, n) \le i < i_{l+1}^*(k, n)$, then change the number of servers open to $l$. In this case we define the policy $\pi$ by

$$\pi(n, i, k) = l \quad \text{if } i_l^*(k, n) \le i < i_{l+1}^*(k, n). \tag{1}$$

The limits $i_l^*(k, n)$ can be obtained recursively from the sets $I_n(l|k)$ as follows: set $i_{s+1}^*(k, n) = +\infty$; and for $j = s, s-1, \ldots, 0$, set $i_j^*(k, n) = i_{j+1}^*(k, n)$ if $I_n(j|k) = \emptyset$ and otherwise set $i_j^*(k, n) = \min\{i : i \in I_n(j|k)\}$.

A similar policy was suggested by Magazine [4]. In the next section of the paper we show that an optimal policy of the above control limit type exists.

The Control Limit Form of the Optimal Policy

Let $Z_n$ and $L_n$ respectively denote the number of customers in the system and the number of servers on at the decision epoch with $n$ periods remaining.
It is convenient to model the $\{Z_n, L_n\}$ process in the following way. Let

$$X_n = \{\tau_{n1}, \tau_{n2}, \tau_{n3}, \ldots;\ S_{n1}, S_{n2}, S_{n3}, \ldots\}$$

be a sequence of independent random variables with each $\tau$ having the exponential interarrival distribution and each $S$ having the exponential service time distribution. If

$$r = \max\Big\{t : \sum_{j=1}^{t} \tau_{nj} < 1\Big\},$$

then $r$ customers will arrive during period $n$ at times $\tau_{n1}, \tau_{n1} + \tau_{n2}, \ldots, \sum_{j=1}^{r} \tau_{nj}$ measured from the beginning of the period. The service times are $S_{n1}, S_{n2}, \ldots, S_{nr}$ for the $r$ customers who arrive during the period and $S_{n,r+1}, S_{n,r+2}, \ldots, S_{n,r+Z_n}$ for the $Z_n$ customers in the system at the beginning of the period. We also assume that $\{X_n : n = 1, 2, 3, \ldots\}$ is a sequence of independent vectors. Then $Z_{n-1}$ is a function of $Z_n$, $L_n$, and $X_n$ for a given policy $\pi$, i.e., for each policy $\pi$ there is a function $f$ such that

$$Z_{n-1} = f(Z_n, L_n, X_n), \qquad n = 1, 2, \ldots.$$

We assume that during period $n$, customers have priority in the order that the service times are listed in the vector $X_n$. Thus the customers who arrive during the period have priority over those customers present at the beginning of the period. The priorities are preemptive, and customers who are preempted resume their service at the point at which it was preempted.

Thus the process representing the number of customers in the system is modelled somewhat differently than usual, since we choose new service times for all customers in the system at a decision epoch, even if they are in service. In addition, the queue discipline is not the usual one. However, since the service times are exponential, the probability distribution of the stochastic process $\{Z_n\}$ will be the same as in a normal M/M/S system. That is, $P_{ij}^{l}$ is the same for our system as for the usual first-come first-served M/M/S system. Consequently, any result that depends only on $P_{ij}^{l}$ will be true in both our system and the usual system.
In particular, all results except for Lemma 1 hold for the usual M/M/S system.

In the following discussion, the period index $n$ will be clear from the context and will be dropped for notational convenience. Let $N_l(X|i)$ be the random variable denoting $Z_{n-1}$ given that $\pi(Z_n, L_n) = l$ and $Z_n = i$; that is, $N_l(x|i)$ is the number of customers left at the end of the period, given that $i$ customers are present at the beginning of the period, that $l$ servers are open during the period, and that $X_n = x$. The following discussion uses the difference notation

$$\frac{\Delta f(y)}{\Delta y} = \frac{f(y + \Delta y) - f(y)}{\Delta y}.$$

LEMMA 1: For any realization $x$ of $X$, $N_m(x|i+u) - N_{m'}(x|i)$ is non-negative and nondecreasing in $i$ for $m' \ge m$ and $u \ge 0$.

PROOF: For any realization, $\Delta N_m(x|i+u)/\Delta i$ and $\Delta N_{m'}(x|i)/\Delta i$ are either 0 or 1. To see this, note that the additional customer has the lowest priority, and so does not affect any of the other customers. The number left at the end of the period increases by one if the additional customer does not complete service, and remains the same if he does. It is equally clear that

$$N_{m'}(x|i+1) = N_{m'}(x|i) + 1$$

implies

$$N_m(x|i+u+1) = N_m(x|i+u) + 1.$$

This gives

$$\frac{\Delta N_m(x|i+u)}{\Delta i} \ge \frac{\Delta N_{m'}(x|i)}{\Delta i}.$$

LEMMA 2: Assume that $\Delta^2 K(i)/\Delta i^2 \ge 0$. Then for $m' \ge m$,

$$\sum_{j=0}^{\infty} [P_{ij}^{m} - P_{ij}^{m'}]\,K(j)$$

is nondecreasing in $i$.

PROOF: We have

$$\sum_{j=0}^{\infty} [P_{ij}^{m} - P_{ij}^{m'}]\,K(j) = E[K(N_m(X|i)) - K(N_{m'}(X|i))].$$

For $X = x$, we have

$$\begin{aligned}
\frac{\Delta K(N_m(x|i))}{\Delta i} &= \left.\frac{\Delta K(y)}{\Delta y}\right|_{y=N_m(x|i)} \frac{\Delta N_m(x|i)}{\Delta i} \\
&\ge \left.\frac{\Delta K(y)}{\Delta y}\right|_{y=N_m(x|i)} \frac{\Delta N_{m'}(x|i)}{\Delta i} \\
&\ge \left.\frac{\Delta K(y)}{\Delta y}\right|_{y=N_{m'}(x|i)} \frac{\Delta N_{m'}(x|i)}{\Delta i} = \frac{\Delta K(N_{m'}(x|i))}{\Delta i},
\end{aligned}$$

where the first inequality follows by Lemma 1 and the second since $N_m(x|i) \ge N_{m'}(x|i)$ and $\Delta^2 K(i)/\Delta i^2 \ge 0$. Thus $K(N_m(x|i)) - K(N_{m'}(x|i))$ is nondecreasing in $i$. Hence the expectation is also nondecreasing in $i$.

THEOREM 1: Assume that $\Delta^2 K_1(i)/\Delta i^2 \ge 0$. Then for $n = 1$ the optimal policy is of the control limit form given by display (1).

PROOF: Recall that

$$\phi_1(i, \alpha, k) = \min_{0 \le l \le s} \psi_1(l|i, \alpha, k).$$

We prove the theorem by showing that $\psi_1(l|i, \alpha, k) - \psi_1(l+1|i, \alpha, k)$ is nondecreasing in $i$, which implies that $\psi_1(l|i, \alpha, k)$ intersects $\psi_1(l+1|i, \alpha, k)$ at most once and from below. The above difference can be written as

$$[S(l|k) - S(l+1|k)] + \sum_{j=0}^{\infty} (P_{ij}^{l} - P_{ij}^{l+1})\,K_1(j).$$

The first term is constant in $i$ and the second term is nondecreasing in $i$ by Lemma 2.

LEMMA 3: Assume that $\Delta^2 K_1(i)/\Delta i^2 \ge 0$. Then

$$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i\,\Delta k} \le 0.$$

PROOF: Let

$$\begin{aligned}
\phi_1(i, \alpha, k) &= S(k_1|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j), \\
\phi_1(i, \alpha, k+1) &= S(k_2|k+1) + \sum_{j=0}^{\infty} P_{ij}^{k_2} K_1(j), \\
\phi_1(i+1, \alpha, k) &= S(k_1'|k) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_1'} K_1(j), \\
\phi_1(i+1, \alpha, k+1) &= S(k_2'|k+1) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_2'} K_1(j).
\end{aligned}$$

First we show that $k_2 \ge k_1$ and $k_2' \ge k_1'$. Since $\phi_1(i, \alpha, k) \le \psi_1(l|i, \alpha, k)$ for each $l$, we have

$$S(k_1|k) - S(l|k) \le \sum_{j=0}^{\infty} (P_{ij}^{l} - P_{ij}^{k_1})\,K_1(j).$$

If $l < k_1$, then by assumption (ii) of the cost structure

$$S(k_1|k+1) - S(l|k+1) \le S(k_1|k) - S(l|k).$$

Consequently,

$$S(k_1|k+1) - S(l|k+1) \le \sum_{j=0}^{\infty} (P_{ij}^{l} - P_{ij}^{k_1})\,K_1(j).$$

Regrouping terms provides

$$\psi_1(k_1|i, \alpha, k+1) \le \psi_1(l|i, \alpha, k+1) \quad \text{for } l < k_1.$$

This expression demonstrates that $k_2 \ge k_1$. A similar argument shows that $k_2' \ge k_1'$. From Theorem 1 we also know that $k_1' \ge k_1$ and $k_2' \ge k_2$.

The proof proceeds by fixing $k$ and considering three cases.

1. Suppose $i \ge i_{k+1}^*(k)$. By definition of $\phi_1(i, \alpha, k)$ we have

$$S(k_1|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j) \le S(l|k) + \sum_{j=0}^{\infty} P_{ij}^{l} K_1(j).$$

From Theorem 1, we know that $k_1 \ge k+1$. So the above inequality implies

$$S(k_1|k+1) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j) \le S(l|k+1) + \sum_{j=0}^{\infty} P_{ij}^{l} K_1(j)$$

for $l \ge k_1$, and thus $k_2 \le k_1$. However, we showed earlier that $k_2 \ge k_1$. Therefore $k_1 = k_2$. In this case,

$$\frac{\Delta \phi_1(i, \alpha, k)}{\Delta k} = S(k_1|k+1) - S(k_1|k) \quad \text{and} \quad \frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i\,\Delta k} = 0.$$

2. Suppose $i < i_{k+1}^*(k+1)$. In this case a similar argument shows that again $k_1 = k_2$ and

$$\frac{\Delta \phi_1(i, \alpha, k)}{\Delta k} = S(k_2|k+1) - S(k_2|k).$$

Thus

$$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i\,\Delta k} = 0.$$

3. Suppose $i_{k+1}^*(k+1) \le i < i_{k+1}^*(k)$. From Theorem 1, we have that $k_1 \le k$, $k_1' \le k+1$, $k_2 \ge k+1$, and $k_2' \ge k+1$. Combining these inequalities with those verified in the first part of the proof provides

$$k_1 \le k_1' \le k+1 \le k_2 \le k_2'.$$
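Since $k_2 \ge k_1'$,

$$\begin{aligned}
\phi_1(i, \alpha, k) - \phi_1(i, \alpha, k+1) &= S(k_1|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j) - S(k_2|k+1) - \sum_{j=0}^{\infty} P_{ij}^{k_2} K_1(j) \\
&\le S(k_1'|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1'} K_1(j) - S(k_2|k+1) - \sum_{j=0}^{\infty} P_{ij}^{k_2} K_1(j) \\
&\le S(k_1'|k) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_1'} K_1(j) - S(k_2|k+1) - \sum_{j=0}^{\infty} P_{i+1,j}^{k_2} K_1(j) \\
&\le S(k_1'|k) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_1'} K_1(j) - S(k_2'|k+1) - \sum_{j=0}^{\infty} P_{i+1,j}^{k_2'} K_1(j) \\
&= \phi_1(i+1, \alpha, k) - \phi_1(i+1, \alpha, k+1),
\end{aligned}$$

where the middle inequality follows from Lemma 2. Thus for $i_{k+1}^*(k+1) \le i < i_{k+1}^*(k)$,

$$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i\,\Delta k} \le 0,$$

which concludes the proof.

LEMMA 4: Assume that

$$\frac{\Delta^2}{\Delta i^2}\,[K_{n+1}(i) + \alpha \phi_n(i, \alpha, k)] \ge 0$$

for $n = 0, 1$ and all $k = 0, 1, \ldots, s$. Then for $m' \ge m$,

$$\sum_{j=0}^{\infty} \{P_{ij}^{m}[K_2(j) + \alpha \phi_1(j, \alpha, m)] - P_{ij}^{m'}[K_2(j) + \alpha \phi_1(j, \alpha, m')]\}$$

is nondecreasing in $i$.

PROOF:

$$\sum_{j=0}^{\infty} \{P_{ij}^{m}[K_2(j) + \alpha \phi_1(j, \alpha, m)] - P_{ij}^{m'}[K_2(j) + \alpha \phi_1(j, \alpha, m')]\} = E\{[K_2(N_m(X|i)) + \alpha \phi_1(N_m(X|i), \alpha, m)] - [K_2(N_{m'}(X|i)) + \alpha \phi_1(N_{m'}(X|i), \alpha, m')]\}.$$

For $X = x$ we have

$$\begin{aligned}
\frac{\Delta [K_2(N_m(x|i)) + \alpha \phi_1(N_m(x|i), \alpha, m)]}{\Delta i}
&= \left[\left.\frac{\Delta K_2(y)}{\Delta y}\right|_{y=N_m(x|i)} + \alpha \left.\frac{\Delta \phi_1(y, \alpha, m)}{\Delta y}\right|_{y=N_m(x|i)}\right] \frac{\Delta N_m(x|i)}{\Delta i} \\
&\ge \left[\left.\frac{\Delta K_2(y)}{\Delta y}\right|_{y=N_m(x|i)} + \alpha \left.\frac{\Delta \phi_1(y, \alpha, m)}{\Delta y}\right|_{y=N_m(x|i)}\right] \frac{\Delta N_{m'}(x|i)}{\Delta i} \\
&\ge \left[\left.\frac{\Delta K_2(y)}{\Delta y}\right|_{y=N_{m'}(x|i)} + \alpha \left.\frac{\Delta \phi_1(y, \alpha, m)}{\Delta y}\right|_{y=N_{m'}(x|i)}\right] \frac{\Delta N_{m'}(x|i)}{\Delta i} \\
&\ge \left[\left.\frac{\Delta K_2(y)}{\Delta y}\right|_{y=N_{m'}(x|i)} + \alpha \left.\frac{\Delta \phi_1(y, \alpha, m')}{\Delta y}\right|_{y=N_{m'}(x|i)}\right] \frac{\Delta N_{m'}(x|i)}{\Delta i} \\
&= \frac{\Delta [K_2(N_{m'}(x|i)) + \alpha \phi_1(N_{m'}(x|i), \alpha, m')]}{\Delta i},
\end{aligned}$$

where the first inequality is by Lemma 1, the second since

$$\frac{\Delta^2}{\Delta i^2}\,[K_2(i) + \alpha \phi_1(i, \alpha, m)] \ge 0,$$

and the third by Lemma 3. Consequently,

$$[K_2(N_m(x|i)) + \alpha \phi_1(N_m(x|i), \alpha, m)] - [K_2(N_{m'}(x|i)) + \alpha \phi_1(N_{m'}(x|i), \alpha, m')]$$

is nondecreasing in $i$ for every realization of $X$. Hence its expectation is also nondecreasing in $i$ and the lemma is proved.

Thus, by using Lemma 4 and following the same arguments as in Theorem 1, we have that $\phi_2$ has an optimal control limit policy given by display (1). By using the above arguments recursively, the following theorem is obtained.

THEOREM 2: Assume that

$$\frac{\Delta^2}{\Delta i^2}\,[K_{n+1}(i) + \alpha \phi_n(i, \alpha, k)] \ge 0$$

for $k = 0, 1, 2, \ldots, s$ and that the optimal policy in period $n$ has the control limit form given by display (1). Then, for $m' \ge m$,

$$\sum_{j=0}^{\infty} \{P_{ij}^{m}[K_{n+1}(j) + \alpha \phi_n(j, \alpha, m)] - P_{ij}^{m'}[K_{n+1}(j) + \alpha \phi_n(j, \alpha, m')]\}$$

is nondecreasing in $i$ and the optimal policy in period $n+1$ has the control limit form given by display (1).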
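When the optimal policy has the control limit form, the limits can be read off the index sets $I_n(l|k)$ by the backward recursion described under Policy Structure. The sketch below is only an illustration under stated assumptions: the function names and the dictionary layout of the index sets are hypothetical, and the sets are taken as already computed for one fixed pair $(n, k)$.

```python
import math


def control_limits(index_sets, s):
    """Return [i_0*, ..., i_s*, i_{s+1}*] for one fixed (n, k).

    index_sets maps l (number of servers to open) to the set of queue
    lengths i for which l is optimal; missing or empty entries are allowed.
    """
    limits = [None] * (s + 2)
    limits[s + 1] = math.inf                 # i_{s+1}* = +infinity
    for j in range(s, -1, -1):               # j = s, s-1, ..., 0
        members = index_sets.get(j, set())
        # Empty set: inherit the next limit; otherwise take the smallest member.
        limits[j] = min(members) if members else limits[j + 1]
    return limits


def policy(i, limits):
    """pi(n, i, k) = l if limits[l] <= i < limits[l+1], as in display (1)."""
    for l in range(len(limits) - 1):
        if limits[l] <= i < limits[l + 1]:
            return l
    raise ValueError("queue length lies below every control limit")
```

With $s = 3$ and index sets $\{0, 1\}$, $\{2, 3, 4\}$, $\emptyset$, $\{5, 6, 7\}$ for $l = 0, 1, 2, 3$, the limits are $0, 2, 5, 5, +\infty$, so two servers are never selected and any queue length of five or more maps to three open servers.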
COROLLARY: If $K_1, \ldots, K_N$ are sufficiently convex such that

$$\frac{\Delta^2 [K_{n+1}(i) + \alpha \phi_n(i, \alpha, k)]}{\Delta i^2} \ge 0$$

for $n = 0, 1, \ldots, N-1$ and all $k = 0, 1, \ldots, s$, then the optimal policies for the first $N$ periods have the control limit form.

REFERENCES

[1] Balachandran, K. R., "Control Policies for a Single Server System," Management Science, 19, 1013-1018 (1973).
[2] Bell, C. E., "Characterization and Computation of Optimal Policies for Operating an M/G/1 Queuing System with Removable Server," Operations Research, 20, 2080-2180 (1972).
[3] Heyman, D. P., "Optimal Operating Policies for M/G/1 Queuing Systems," Operations Research, 16, 362-382 (1968).
[4] Magazine, M. J., "Optimal Control of Multi-channel Service Systems," Naval Research Logistics Quarterly, 18, 177-183 (1971).
[5] Sobel, M. J., "Optimal Average-Cost Policy for Queue with Start-up, and Shut-down Costs," Operations Research, 17, 145-162 (1969).
[6] Zacks, S. and M. Yadin, "Analytic Characterization of the Optimal Control of a Queueing System," Journal of Applied Probability, 617-633 (1970).

CYCLICAL JOB SEQUENCING ON MULTIPLE SETS OF IDENTICAL MACHINES*

Helman I. Stern
Ben Gurion University of the Negev
Beersheva, Israel

Edgardo P. Rodriguez
World Bank
Washington, D.C.

Merlin L. Utter
Proctor and Gamble
Cincinnati, Ohio

ABSTRACT

The problem posed in this paper is to sequence or route $n$ jobs, each originating at a particular location or machine, undergoing $r-1$ operations or repairs, and terminating at the location or machine from which it originated. The problem is formulated as a 0-1 integer program, with block diagonal structure, comprised of $r$ assignment subproblems and a joint set of constraints to insure cyclical sequences. To obtain integer results the solutions to each subproblem are ranked as required and combinations thereof are implicitly enumerated. The procedure may be terminated at any step to obtain an approximate solution.
Some limited computational results are presented.

*This research was partially supported by National Science Foundation grant No. GK-27836.

INTRODUCTION

Much attention has been devoted to the classical $n$ job, $m$ machine shop scheduling problem. In most investigations each job is given a prespecified technological ordering. Less attention has been given to the problem of job processing options. For example, a job may be processed on machine A or machine B but not both.

The problem posed in this paper is to sequence or route $n$ jobs, each originating at a particular location or machine, undergoing $r-1$ ordered operations or repairs, in which the $s$th operation may be performed on only one of $n$ similar machines, after which each job terminates at the machine from which it originated. The problem may also be visualized as the routing of a fleet of $n$ vehicles or ships with each vehicle loading a commodity of type 1 at its origin location, delivering this commodity to any city in group 1, reloading a commodity of type 2 to be delivered to any city in group 2, etc., such that each city in a group is visited exactly once. On the final leg of the journey each vehicle either returns empty or discharges a commodity of type $r$ at its origin location. The problem is also a special case of the $m$ traveling salesman problem requiring $m$ disjoint closed tours subject to a visitation constraint where each salesman must visit one city from each group in a specified order.

To give the problem concreteness let there be $n$ jobs and $r$ different types of machines with $n$ identical machines of each type ($s = 1, 2, \ldots, r$). Let the $i$th machine of type $s$ be represented as $m_i(s)$. Each job $k$ is to initiate and terminate its sequence on the same machine of type 1, say machine $m_k(1)$. Job $k$ must be processed on exactly one machine of each of the machine types $2, 3, \ldots, r$, in increasing order. No machine may process more than one job.
Thus, the $k$th job has the technological ordering

$$\{[m_k(1)],\ [m_1(2) \text{ or } m_2(2) \ldots \text{ or } m_n(2)],\ \ldots,\ [m_1(s) \text{ or } m_2(s) \ldots \text{ or } m_n(s)],\ \ldots,\ [m_1(r) \text{ or } m_2(r) \ldots \text{ or } m_n(r)],\ [m_k(1)]\}.$$

A feasible solution to this problem must have each machine assigned to one and only one job. An example of a feasible solution for 3 jobs and 4 machine types is shown in Figure 1, and consists of 3 disjoint cycles each comprised of four arcs. For this particular problem there are $(3!)^3 = 216$ feasible solutions. In general, for an $n$ job, $r$ machine type problem there are $(n!)^{r-1}$ feasible solutions. If the cost of transferring a job of any kind from machine $m_i(s)$ to $m_j(s+1)$ is given as $c_{ij}(s)$, the problem becomes one of finding an efficient algorithm that searches the set of feasible solutions and selects that solution (or solutions) that minimizes the total cost of sequencing all jobs. It is assumed that the processing costs on any machine of a given type are job independent. Thus, if a processing cost of $c_i(s)$ represents the cost to process any job on machine $m_i(s)$, it may be added to the transfer cost and included in $c_{ij}(s)$ without loss of generality.

Figure 1. Cyclical sequencing of three jobs through four sets of machines (columns: machine types 1 to 4; rows: jobs 1 to 3).

The problem is formulated as a 0-1 integer program, with block diagonal structure comprised of $r$ assignment subproblems, and a joint set of constraints that insure closed tours. Since all variables are 0-1, the traditional Dantzig-Wolfe decomposition scheme is precluded. To insure integer results the solutions to each assignment subproblem are ranked as required and combinations thereof are implicitly enumerated in a branch and bound scheme. The procedure may be terminated at any step to obtain a feasible solution along with bounds on its accuracy. Some limited computational results are given in the last section.
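The count $(n!)^{r-1}$ quoted above can be confirmed by brute force on small instances. The sketch below is an illustration, not the authors' procedure: it assumes each stage is encoded as a permutation `p` with `p[i] = j` meaning that the machine $i$ of one type hands its job to machine $j$ of the next type, with the last stage returning to type 1, and the function names are hypothetical.

```python
from itertools import permutations, product
from math import factorial


def count_feasible(n, r):
    """Closed-form count of feasible solutions, (n!)^(r-1)."""
    return factorial(n) ** (r - 1)


def brute_force_count(n, r):
    """Enumerate all (n!)^r stage assignments and count the cyclical ones.

    A combination is cyclical when every job, followed through all r
    stages, returns to the type-1 machine it started from.
    """
    all_perms = list(permutations(range(n)))
    count = 0
    for stages in product(all_perms, repeat=r):
        closed = True
        for job in range(n):
            node = job
            for p in stages:          # follow the job stage by stage
                node = p[node]
            if node != job:
                closed = False
                break
        if closed:
            count += 1
    return count
```

For the example in the text, `brute_force_count(3, 4)` examines $6^4 = 1296$ combinations and finds 216 of them cyclical, in agreement with `count_feasible(3, 4)`.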
Related work has appeared in the literature starting with the truck dispatching problem of Dantzig and Ramser [1], Clark and Wright [2], and Newton and Thomas [3], in which all $n$ tours originate and terminate at a single city with the number of cities visited by each tour implicitly conditional on the order of visits (due to demand variations at each city). Krolak, Felts, and Nelson [4] report on the work of Newton and Thomas in which multi-origin, single destination routes are scheduled. Bellmore, Liebman, and Marks [5] consider a multi-origin problem in which each tour is open but with ordered node set visitations. Svestka and Huckfield [6] solve an $n$ closed tour problem through $m$ nodes such that $n$ sorties of any length commence and terminate at a single node. Srivastava, Kumar, Garg, and Sen [7] consider a single closed tour visiting $r$ cities, one from each of $r$ ordered sets of cities, such that each set has a cardinality of at least one.

MATHEMATICAL FORMULATION

The problem may be formulated as a 0-1 minimum cost network flow problem with closed tour side conditions. The underlying network is depicted in Figure 2, where the node set $N_s$ represents the set of $n$ machines of type $s$. The node sets are ordered from left to right with $N_1$ repeated and designated as $N_{r+1}$ as a visual aid. The only arcs in the network are those emanating from each set of nodes to the next higher set of nodes. The arcs from $N_r$ to $N_{r+1}$ are return arcs required to complete cycles. Let the machine sequence of each job start from its origin node in $N_1$, proceed through a single node from each of the remaining ordered sets, $N_2, N_3, \ldots, N_r$, and end at its origin node repeated in $N_{r+1}$. In Figure 2 the only arcs shown are those possible for a cyclical machine sequence for job $k$.

Figure 2. Network representation of problem (only flows for job $k$ are shown).

The variables $x_{ij}^k(s)$ represent the amount of job $k$ flowing from node $i$ in $N_s$ to node $j$ in $N_{s+1}$. The set of all flows from $N_s$ to $N_{s+1}$ is represented as $X(s)$. To insure full job assignments to machines let $x_{ij}^k(s)$ be restricted to a unit value if the $k$th job is sequenced on machine $j$ in $N_{s+1}$ directly after machine $i$ in $N_s$. Otherwise, the flow will be restricted to zero. Total flows through each node in the network will be restricted to unity so that exactly one job will be sequenced through each machine. In addition, only the $k$th job may flow through node $k$ of $N_1$. Thus, one may drop the superscripts from $X(1)$ and $X(r)$ and reduce the number of flow variables to those described by $y_{ki}(1)$ and $y_{jk}(r)$. (This is an arbitrary stipulation, as any one to one correspondence between the $n$ jobs and $n$ machines or origins in $N_1$ will suffice.) This final restriction insures that the $k$th job originates and terminates its sequence at node $k$ of sets $N_1$ and $N_{r+1}$. In addition, define 0-1 variables $y_{ij}(s)$ for $s = 2, 3, \ldots, r-1$ which represent the joint flows on each arc $(i, j)$ in $(N_s, N_{s+1})$. Let $c_{ij}(s)$ represent the cost (distance) of any job transferred from $i$ in $N_s$ to $j$ in $N_{s+1}$, and $Z$ represent the total cost incurred by all job transfers. Then the problem of minimizing the total job sequencing cost may be formulated as a minimum cost flow problem with 0-1 integer flows and closed tour side constraints. The mathematical formulation of this problem is shown below as problem (P).

Integer Program (P)

$$\min Z = \sum_{(k,i) \in (N_1, N_2)} c_{ki}(1)\,y_{ki}(1) + \sum_{s=2}^{r-1} \sum_{(i,j) \in (N_s, N_{s+1})} c_{ij}(s)\,y_{ij}(s) + \sum_{(j,k) \in (N_r, N_{r+1})} c_{jk}(r)\,y_{jk}(r)$$

Subject to:

$$\sum_{j \in N_{s+1}} y_{ij}(s) = 1, \quad i \in N_s, \quad s = 1, \ldots, r \tag{1}$$

$$\sum_{i \in N_s} y_{ij}(s) = 1, \quad j \in N_{s+1}, \quad s = 1, \ldots, r \tag{2}$$

$$y_{ki}(1) - \sum_{j \in N_3} x_{ij}^k(2) = 0, \quad (k, i) \in (N_1, N_2) \tag{3}$$

$$y_{ij}(2) - \sum_{k=1}^{n} x_{ij}^k(2) = 0, \quad (i, j) \in (N_2, N_3) \tag{4}$$

$$\sum_{i \in N_{s-1}} x_{ij}^k(s-1) - \sum_{i \in N_{s+1}} x_{ji}^k(s) = 0, \quad j \in N_s, \quad k = 1, \ldots, n, \quad s = 3, \ldots, r-1 \tag{5}$$

$$y_{ij}(s) - \sum_{k=1}^{n} x_{ij}^k(s) = 0, \quad (i, j) \in (N_s, N_{s+1}), \quad s = 3, \ldots, r-1 \tag{6}$$

$$y_{jk}(r) - \sum_{i \in N_{r-1}} x_{ij}^k(r-1) = 0, \quad (j, k) \in (N_r, N_{r+1}) \tag{7}$$

$$y_{ij}(s) = 0, 1; \quad (i, j) \in (N_s, N_{s+1}), \quad s = 1, \ldots, r \tag{8}$$

$$x_{ij}^k(s) = 0, 1; \quad (i, j) \in (N_s, N_{s+1}), \quad s = 2, \ldots, r-1, \quad k = 1, \ldots, n \tag{9}$$

Equations (1), (2), and (8) represent $r$ independent assignment problem constraints insuring that each node has unit flow. Together they provide $n$ disjoint job sequences from nodes in $N_1$ to nodes in $N_{r+1}$, not necessarily closed. The interstage coupling constraints (3), (4), (5), (6), (7), and (9) provide the conditions necessary for all tours to be closed (cyclical sequences). Upon rearrangement, the entire set of constraints can be shown to exhibit a block diagonal structure.

DECOMPOSITION OF (P)

Let (B) represent the set of closed tour coupling constraints. Removal of (B) from (P) allows the remaining portion of (P) to be decoupled into $r$ independent assignment problems. Let $(A_s)$ represent the $s$th stage assignment problem from machines in $N_s$ to machines in $N_{s+1}$. Thus, $(A_s)$ is of the form:

$$\text{Minimize } Z_s = \sum_{(i,j) \in (N_s, N_{s+1})} c_{ij}(s)\,y_{ij}(s)$$

Subject to:

$$\sum_{j \in N_{s+1}} y_{ij}(s) = 1, \quad i \in N_s$$

$$\sum_{i \in N_s} y_{ij}(s) = 1, \quad j \in N_{s+1}$$

$$y_{ij}(s) = 0, 1; \quad (i, j) \in (N_s, N_{s+1})$$

Let (A) represent the union of all $r$ assignment problems with a feasible solution defined by $[Z, Y]$ such that

$$Z = \sum_{s=1}^{r} Z_s, \qquad Y = [Y_1, \ldots, Y_s, \ldots, Y_r],$$

where $Y_s$ is a feasible solution to $(A_s)$ and $Z_s$ is its associated cost. Note that an optimal solution $[Z^0, Y^0]$ to (A) provides a lower bound to (P). If, in addition, there exists a solution $[X, Y^0]$ to (B), then $[Z^0, Y^0, X]$ is also optimal for (P). Otherwise, one may enumerate all feasible solutions to (A) and select, among those that also provide a feasible solution to (B), the one with the lowest objective value.
Such a process, however, requires the computation of $(n!)^r$ solutions to (A) and an equal number of feasibility tests in (B).† A problem with $n = 7$ and $r = 3$ requires the determination of $6.4 \times 10^{10}$ feasible solutions to (A). Hence, an implicit enumeration branch and bound procedure is proposed. This procedure requires an efficient technique for determining if proposed solutions to (A) satisfy the side conditions in (B), as well as a method of constructing upper bounds on (P).

†Note that all feasible solutions to the assignment problem are basic solutions.

PROPOSITION 1: Testing $Y$ for Feasible Closed Tours

If $(Z^*, Y^*)$ is a feasible solution to (A), then (B) has either a unique solution $[X^*, Y^*]$ and $Z^*$ provides an upper bound for (P), or there does not exist a solution to (B) corresponding to $Y^*$.

This result follows from the staircase structure of the constraint set (B), where (3) and (4) link stages 1 and 2, (5) and (6) link stages $s$ and $s+1$, for $s = 2, \ldots, r-1$, and (7) links stages $r-1$ and $r$. Since each assignment solution provides exactly $n$ variables with unit values and $n(n-1)$ variables at zero, the $Y_1^*, \ldots, Y_{r-1}^*$ determine uniquely $X_2^*, \ldots, X_{r-1}^*$. If $Y_r^*$, $X_{r-1}^*$ also satisfy (7), $Z^*$ provides an upper bound for (P). Otherwise, there does not exist a solution to (P) corresponding to $Y^*$. Intuitively, one is projecting the paths of all $n$ jobs, stagewise, from their origin nodes in $N_1$ to their terminating nodes in $N_{r+1}$. Constraint (7) provides the test to determine if the $k$th job originates and terminates at the $k$th node in $N_1$. In lieu of sequentially solving the set of coupling equations for the $x_{ij}^k(s)$, one may incorporate a composite function on the subscripts of the $Y^*$ solution for an easily computerized closed tour test. Moreover, if $Y^*$ does not provide a closed tour solution, fixing $Y^*$ for any $r-1$ stages yields a forced feasible solution for the remaining $r$th stage, providing a feasible upper bound on (P).
This observation yields the following proposition:

PROPOSITION 2: Construction of Revised Feasible Solution

Given $(Z^*, Y^*)$, a solution to (A) which does not provide a feasible solution for (B), one may obtain a feasible solution to (B) by constructing a revised solution $Y^{*s} = [Y_1^*, \ldots, \hat{Y}_s^*, \ldots, Y_r^*]$, where $\hat{Y}_s^*$ is the unique $s$-th stage solution to $(A_s)$ which allows the construction of a feasible solution to (B) (a revised solution) and provides a feasible upper bound of $Z^{*s}$ on (P). By computing a set of $r$ revised upper bounds $Z^{*s}$, $s = 1, \ldots, r$, for (P), the tightest upper bound, $\bar{Z}^*$, may be determined as

$$\bar{Z}^* = \min_{s = 1, \ldots, r} Z^{*s}.$$

Example of Revised Solutions

Consider a solution shown pictorially in Figure 3a for a three machine problem with three jobs. This solution is feasible for (A) but infeasible for (B). The three uniquely determined revised solutions (feasible for (B)) are shown in Figures 3b, 3c and 3d.

[Figure 3. Example of revised solutions. Panels: (a) given solution; (b), (c), (d) revised solutions.]

TERMINOLOGY

The results of Propositions 1 and 2 provide the constructive basis for determining solutions to (A) and upper bounds on (P). To facilitate the implicit enumeration of all potential solutions of (A), the following definitions are introduced:

1. The Ranked Assignment, $Y_{st}$: The assignments to $(A_s)$ may be ordered by non-decreasing values of $Z_s$, such that $Y_{st}$ represents the $t$-th ranked assignment and $Z_{st}$ its associated cost. The ranking index $t = 1, \ldots, n!$ is selected by the rule

$$\forall\, a, b \in \{1, \ldots, n!\} \ni a > b, \quad Z_{sa} \ge Z_{sb}.$$

2. A Node, $Y^k$: Any feasible solution to (A), say the one identified as the $k$-th solution, may be represented as a node, $Y^k$, where

$$Y^k = \{Y_{1t_1}, \ldots, Y_{st_s}, \ldots, Y_{rt_r}\} \ni t_s \in \{1, \ldots, n!\}, \quad s = 1, \ldots, r.$$

Where clear, a node may be represented as an ordered set of assignments, each assignment represented by the integer equal to its rank, i.e., $(t_1, \ldots, t_s, \ldots, t_r)$.

3. Set of Solutions: Let $P_{u_1, \ldots, u_s, \ldots, u_r}$ be the set of all solutions to (A) less the $(u_s - 1)$ best solutions for $(A_s)$, $s = 1, \ldots, r$, i.e.,

$$P_{u_1, \ldots, u_s, \ldots, u_r} = \{(t_1, \ldots, t_s, \ldots, t_r) \mid t_s \ge u_s,\ s = 1, \ldots, r\}.$$

Thus, $P_{1, \ldots, 1}$ is the set of all feasible solutions to (A). (When clear we shall refer to the set of all feasible solutions as $P$.)

4. Cost of Node $Y^k$, $Z^k$: The sum of the costs of each of the assignments in $Y^k$ is

$$Z^k = \sum_{s=1}^{r} Z_{st_s}.$$

5. A Feasible Node, $Y^k$: A node, $Y^k$, shall henceforth be called feasible if and only if it provides a feasible solution to (P). (This may be determined by the results of Proposition 1.)

6. A Revised Node, $Y^{ks}$: A revised node, $Y^{ks}$, is the node obtained from an infeasible node $Y^k$ by replacing the $s$-th assignment $Y_{st_s}$ by the unique assignment $Y_{su_s}$ $(u_s \ne t_s)$ such that the new node becomes feasible.

7. Cost of Revised Node $Y^{ks}$, $Z^{ks}$: The cost associated with the revised node $Y^{ks}$ is

$$Z^{ks} = Z^k - Z_{st_s} + Z_{su_s}.$$

8. Upper Bound Associated With Node $Y^k$, $\bar{Z}^k$: An upper bound on (P) associated with node $Y^k$ may be determined as $Z^k$ if $Y^k$ is feasible, or from its set of revised nodes (see Proposition 2), i.e.,

$$\bar{Z}^k = \begin{cases} Z^k, & \text{if } Y^k \text{ is feasible} \\ \min_{s = 1, \ldots, r} Z^{ks}, & \text{if } Y^k \text{ is infeasible.} \end{cases}$$

9. Dominance Between Nodes: The node $(u_1, \ldots, u_s, \ldots, u_r)$ is said to be dominated by the node $(v_1, \ldots, v_s, \ldots, v_r)$ if $u_s \ge v_s$, $s = 1, \ldots, r$. It follows from 1 and 4 above that if node $Y^k$ is dominated by node $Y^l$ then $Z^k \ge Z^l$. In Figure 4 all nodes below the dotted line are dominated by the node (1, 2, 1).

IMPLICIT ENUMERATION OF SOLUTIONS THROUGH BRANCH AND BOUND OF SETS OF RANKED ASSIGNMENTS

In this section the general rationale for searching the solution set $P$ is presented in terms of a branch and bound algorithm. The algorithm is initiated by generating the node $Y^1 = (1, \ldots, 1, \ldots, 1)$ and testing it for feasibility using the results of Proposition 1.†
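Definitions 3, 4 and 9 translate directly into a few lines of Python. The sketch below is only an illustration: it takes the ranked assignment costs from Table 3 of the three stage example given later in the paper, uses 1-based ranks as in the text, and the function names are hypothetical.

```python
from itertools import product

# Ranked assignment costs Z_st as listed in Table 3 (rank 1 first).
RANKED_COSTS = [
    [5, 12, 15, 16, 17, 21],   # stage 1
    [8, 8, 11, 12, 20, 27],    # stage 2
    [2, 4, 6, 7, 10, 17],      # stage 3
]

def node_cost(node):
    """Definition 4: sum of the costs of the ranked assignments in the node."""
    return sum(RANKED_COSTS[s][t - 1] for s, t in enumerate(node))

def is_dominated(u, v):
    """Definition 9: node u is dominated by node v when u_s >= v_s in every stage."""
    return all(us >= vs for us, vs in zip(u, v))

def solution_set(u, n_ranks=6):
    """Definition 3: every node whose ranks are at least u_s in every stage."""
    all_nodes = product(range(1, n_ranks + 1), repeat=len(u))
    return [t for t in all_nodes if is_dominated(t, u)]

print(node_cost((1, 1, 1)), node_cost((1, 1, 2)))   # 15 17
print(len(solution_set((3, 1, 1))))                 # 144
```

Because the costs in each column are non-decreasing in the rank, every node in `solution_set(u)` costs at least as much as `u` itself, which is the property the dominance test relies on.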
If $Y^1$ is feasible then it represents an optimal solution to (P) with optimal cost $Z^1$. Otherwise, $Z^1$ offers a least lower bound to (P), in which case an upper bound may be constructed from $Y^1$ by employing the result of Proposition 2. The scheme for generation of future nodes shall incorporate the assignment ranking procedure of K. G. Murty [8]. It is useful to present the scheme in terms of the tree diagram shown in Figure 4 (for a three stage problem), with each node represented as a point and arrows between points representing branching from a predecessor node to a successor node.

[Figure 4. Illustrative tree for a three stage problem. Node labels shown include (1, 1, 1), (2, 1, 1), (1, 1, 2) and (1, 1, 3).]

†According to Definition 2 this node represents the first ranked solutions of all assignment subproblems.

Branches in the tree are limited to those from a node of the form $(t_1, \ldots, t_s, \ldots, t_r)$ to successor nodes of the form

$$(t_1 + 1, \ldots, t_s, \ldots, t_r),\ \ldots,\ (t_1, \ldots, t_s + 1, \ldots, t_r),\ \ldots,\ (t_1, \ldots, t_s, \ldots, t_r + 1).$$

Any of the successor nodes, say the $s$-th one, may easily be determined from its predecessor through the construction of the next best ranked assignment $Y_{s, t_s + 1}$ for the $s$-th stage assignment problem. To reduce the information storage requirements the ranked assignments are exhausted one stage at a time (to be clarified subsequently).

Associated with each node in the tree, say the $k$-th, are its cost $Z^k$ and an upper bound $\bar{Z}^k$. If the node is feasible then $Z^k$ provides an upper bound and $\bar{Z}^k$ is set equal to $Z^k$; otherwise an associated upper bound $\bar{Z}^k$ is determined through construction of revised nodes (see Definition 8). The best solution determined for a tree with $k$ nodes may be defined as the least upper bound

$$Z_u^k = \min_{l = 1, \ldots, k} \bar{Z}^l.$$

In the $(k+1)$-st step the best upper bound may be updated by the rule

$$Z_u^{k+1} = \begin{cases} Z_u^k, & \text{if } \bar{Z}^{k+1} \ge Z_u^k \\ \bar{Z}^{k+1}, & \text{if } \bar{Z}^{k+1} < Z_u^k. \end{cases}$$

To check for the existence of dominated nodes at step $k+1$ (the dominance test), compare the cost $Z^{k+1}$ associated with the node $Y^{k+1}$ (feasible or infeasible) with $Z_u^{k+1}$. If $Z^{k+1} \ge Z_u^{k+1}$, the set of nodes dominated by $Y^{k+1}$ may be removed from the tree, as they will exhibit costs not less than those of the best solution found thus far. As more nodes are examined, the tree is reduced in size until a point is reached when all nodes in the reduced tree have been examined. If this point is reached at step $q$, then $Z_u^q$ is the optimal cost. This is true since $Z_u^q$ represents the least upper bound on all nodes explicitly examined, and is a better solution than those nodes implicitly examined through placement in the dominated set.

The algorithm will terminate in a finite number of steps, since the number of nodes in the initial tree is $(n!)^r$ and unexamined nodes in the tree are evaluated at each step with at least one being removed. This may be shown in set theoretic notation as follows. Let

$T^k$ = the set of nodes in the tree at step $k$ (reduced tree),
$E^k$ = the set of nodes in $T^k$ examined in previous steps,
$\bar{E}^k$ = the set of nodes not yet examined in $T^k$,
$D^k$ = the set of nodes dominated by nodes in $E^k$.

Thus, at step $k$ of the algorithm the set of nodes $P$ may be partitioned into three nonintersecting sets, i.e.,

$$P = E^k \cup \bar{E}^k \cup D^k.$$

On the $(k+1)$-st iteration a node $Y^{k+1}$ is selected from $\bar{E}^k$ and placed in $E^{k+1}$. If $Y^{k+1}$ determines a set of dominated nodes, those dominated nodes not in $E^k$, say $D(Y^{k+1})$, will also be deleted from $\bar{E}^k$ and added to $D^k$. Thus,

$$|E^{k+1}| = |E^k| + 1$$
$$|\bar{E}^{k+1}| = |\bar{E}^k| - 1 - |D(Y^{k+1})|$$
$$|D^{k+1}| = |D^k| + |D(Y^{k+1})|$$

and $|\bar{E}^{k+1}| < |\bar{E}^k|$. Since $|\bar{E}^1| = |P|$ and $P$ is finite, termination occurs at some point $k = q \ni \bar{E}^q = \emptyset$. A sketch of the algorithm follows.

THE ALGORITHM

Initialization: Let $k = 1$, $\bar{E}^1 = P$, $E^1 = D^1 = \emptyset$, $Z_u^0 = \infty$.

Step 1: Select a node $Y^k$ from $\bar{E}^k$. If $\bar{E}^k = \emptyset$, STOP: OPTIMAL SOLUTION IS $Z_u^k$.

Step 2: Compute cost of $Y^k$:

$$Z^k = \sum_{s=1}^{r} Z_{st_s}.$$

Step 3: Feasibility check. Let

$$\gamma^k = \begin{cases} 1, & \text{if } Y^k \text{ is feasible} \\ 0, & \text{if } Y^k \text{ is infeasible.} \end{cases}$$

Step 4: Upper bound associated with $Y^k$:

$$\bar{Z}^k = \begin{cases} Z^k, & \text{if } \gamma^k = 1 \\ \min_{s = 1, \ldots, r} Z^{ks}, & \text{if } \gamma^k = 0. \end{cases}$$

Step 5: Update best solution:

$$Z_u^k = \begin{cases} Z_u^{k-1}, & \text{if } \bar{Z}^k \ge Z_u^{k-1} \\ \bar{Z}^k, & \text{if } \bar{Z}^k < Z_u^{k-1}. \end{cases}$$

Step 6: Determine dominated nodes at $Y^k$:

$$D(Y^k) = \begin{cases} P_{t_1, \ldots, t_s, \ldots, t_r}, & \text{if } \gamma^k = 1 \text{ or } Z^k \ge Z_u^k \\ \emptyset, & \text{if } \gamma^k = 0 \text{ and } Z^k < Z_u^k, \end{cases}$$

where $P_{t_1, \ldots, t_s, \ldots, t_r}$ is the set of nodes dominated by the node $Y^k = \{Y_{1t_1}, \ldots, Y_{st_s}, \ldots, Y_{rt_r}\}$.

Step 7: Update solution sets:

$$E^{k+1} = E^k + Y^k$$
$$\bar{E}^{k+1} = \bar{E}^k - Y^k - D(Y^k)$$
$$D^{k+1} = D^k + D(Y^k)$$

Return to Step 1.

The selection of a node from $\bar{E}^k$ deserves further explanation. In the actual programming of the algorithm (see [9]) the set operations are performed implicitly, and hence no bookkeeping is necessary to record the set element transfer operations in Step 7. Moreover, the elements in $\bar{E}^k$ are generated as selected. This selection decision is designed to fulfill two objectives: (i) reduce storage in the assignment ranking routine, and (ii) accelerate the generation of dominated nodes through the exploitation of all assignments computed thus far. The selection scheme may be described with the aid of the list of ranked assignments shown in Table 1.

Table 1. List of Ranked Assignments

Ranked        Subproblem
assignment    A_1         ...   A_s         ...   A_r
1             Y_{1,1}     ...   Y_{s,1}     ...   Y_{r,1}
2             Y_{1,2}     ...   Y_{s,2}     ...   Y_{r,2}
...
t             Y_{1,t}     ...   Y_{s,t}     ...   Y_{r,t}
...
n!            Y_{1,n!}    ...   Y_{s,n!}    ...   Y_{r,n!}

A node $Y^k$ is an ordered set of elements selected from Table 1 in such a manner that one element is selected from each column in the list and ordered columnwise. Hence, the list represents the basic data for generating all $(n!)^r$ solutions in $P$. Any column in the list may be generated using Murty's assignment ranking technique. This technique requires a considerable amount of stored data from the first $t - 1$ assignments to determine the $t$-th ranked assignment. Thus, it is efficient to exhaust the generation of assignments one column at a time.
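The interplay of ranking, feasibility testing, revised nodes and dominance can be seen in a compact Python sketch. It is a simplification under stated assumptions, not the program of reference [9]: each column of Table 1 is built by sorting all permutations (a brute-force stand-in for Murty's ranking procedure), nodes are explored depth first instead of one stage at a time, ranks are 0-based, and the cost matrices are hypothetical. All function names are illustrative.

```python
from itertools import permutations, product

def cost_of(cost, perm):
    return sum(cost[i][perm[i]] for i in range(len(perm)))

def ranked(cost):
    """One column of Table 1: all assignments by non-decreasing cost."""
    return sorted((cost_of(cost, p), p) for p in permutations(range(len(cost))))

def closed(perms):
    """Closed tour test: every job returns to its origin node."""
    ends = list(range(len(perms[0])))
    for p in perms:
        ends = [p[e] for e in ends]
    return ends == list(range(len(ends)))

def forced(perms, s):
    """Unique stage-s assignment closing all tours when other stages are fixed."""
    out = [None] * len(perms[0])
    for job in range(len(out)):
        node, target = job, job
        for p in perms[:s]:
            node = p[node]
        for p in reversed(perms[s + 1:]):
            target = p.index(target)
        out[node] = target
    return tuple(out)

def branch_and_bound(stage_costs):
    cols = [ranked(c) for c in stage_costs]
    r, last = len(cols), len(cols[0]) - 1
    best, examined = float("inf"), 0
    root = (0,) * r
    stack, seen = [root], {root}
    while stack:
        t = stack.pop()
        z = sum(cols[s][t[s]][0] for s in range(r))
        if z >= best:
            continue            # t and every node it dominates cannot improve
        examined += 1
        perms = [cols[s][t[s]][1] for s in range(r)]
        if closed(perms):
            best = z            # feasible: successors cost at least z
            continue
        for s in range(r):      # revised nodes give the upper bound
            q = forced(perms, s)
            best = min(best, z - cols[s][t[s]][0] + cost_of(stage_costs[s], q))
        for s in range(r):      # branch to the r successor nodes
            if t[s] < last:
                succ = t[:s] + (t[s] + 1,) + t[s + 1:]
                if succ not in seen:
                    seen.add(succ)
                    stack.append(succ)
    return best, examined

def brute_force(stage_costs):
    """Reference answer: cheapest closed combination over all (n!)^r nodes."""
    perms = list(permutations(range(len(stage_costs[0]))))
    return min(
        sum(cost_of(c, p) for c, p in zip(stage_costs, combo))
        for combo in product(perms, repeat=len(stage_costs))
        if closed(combo)
    )

# Hypothetical three-stage instance with three nodes per stage.
costs = [
    [[4, 1, 3], [2, 0, 5], [3, 2, 2]],
    [[1, 2, 3], [2, 1, 3], [3, 3, 1]],
    [[2, 7, 1], [6, 3, 4], [5, 1, 8]],
]
best, examined = branch_and_bound(costs)
print(best == brute_force(costs), examined)
```

Skipping a node whose cost is not below the incumbent, without generating its successors, plays the role of Step 6: every node dominated by it has a cost at least as large and is therefore never reached through it.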
One proceeds from the origin node $(1, \ldots, 1, \ldots, 1)$ and generates successive nodes by ranking the assignments of the first stage assignment problem until it is determined that a node using the next assignment in the first column of the list is feasible or provides inferior solutions by the law of dominance. (See Step 6.) This is illustrated in Figure 5 as path 1 for a three stage example. One then backtracks and starts to rank the assignments of the second stage. The third stage assignments are not started until all undominated nodes involving combinations of already ranked solutions in stage 1 are examined. These combinations for the example illustrated in Figure 4 are shown in paths 2 and 3. One continues in this fashion until all stages have been ranked. The details of exhausting all combinations of previously generated assignments may be found in reference [9], which is available from the authors upon request.

The efficiency of this scheme is demonstrated in the following three stage example, where only 9 out of a possible 216 nodes are explicitly examined. Moreover, the scheme may be terminated at any step to obtain an approximate feasible solution and a bound on the cost error of this solution through comparison with the lower bound found in the first step.

[Figure 5. A tree to illustrate the sequence of paths followed during the branching scheme.]

EXAMPLE

The cost matrices for this three job, three machine stage example and the associated list of ranked assignments are shown in Tables 2 and 3, respectively.

Table 2. Cost Matrices for Three Stage Example

Stage 1 Stage 2 Stage 3 "3 5 9 10 1 2 8 5 4 6 2 " 1 8 10 5 2 5 _ ■4 8 ~ 2 2 7 _

Table 3. Ranked Assignments for Three Stage Example

        Stage 1              Stage 2              Stage 3
Rank    Assignment   Cost    Assignment   Cost    Assignment   Cost
1       312           5      231           8      312           2
2       213          12      123           8      123           4
3       321          15      321          11      132           6
4       132          16      132          12      321           7
5       123          17      213          20      213          10
6       231          21      312          27      231          17

The assignments in Table 3 are represented as a permutation of the sequence (1, 2, 3). The computational and branching results for each step of the algorithm are shown in Table 4 and Figure 6, respectively.

Table 4. Nodal Information at Each Step of the Algorithm for Three Stage Example

Step   Node        Cost    Upper bound   Min upper bound   Feasibility   Set of dominated nodes
k      Y^k         Z^k     \bar{Z}^k     Z_u^k             \gamma^k      D(Y^k)
1      (1, 1, 1)   15      23            23
2      (2, 1, 1)   22      27            23
3      (3, 1, 1)   25      25            23                              P_{3,1,1}
4      (1, 2, 1)   15      31            23
5      (2, 2, 1)   24      25            23                              P_{2,2,1}
6      (1, 3, 1)   18      22            22
7      (1, 4, 1)   19      23            22
8      (1, 5, 1)   27      22            22                              P_{1,5,1}
9      (1, 1, 2)   17      17            17                1             P_{1,1,2}

[Figure 6. Tree illustration of three stage example.]

The optimal cyclical job sequences are shown in Figure 7.

[Figure 7. Optimal cyclical job sequences for three stage example.]

It is noted that of the 6 ranked assignments for each stage only the first 3, 5 and 2 ranked assignments for stages 1, 2 and 3, respectively, had to be computed in the example.

A computer program has been written in Fortran IV for the IBM 360/50 to apply the algorithm for $r$ equal to three [9]. Eight sample problems were solved using this code, with execution times shown in Table 5. The reader should be cautioned that these times are expected to increase as additional stages are added beyond three, although no computational evidence has been accumulated as yet.

Table 5. Computer Results for Sample Problems (r = 3)

Problem number   Problem size (n)   Execution time (seconds)
1                3                   8.5
2                4                   8.8
3                4                   9.5
4                5                   8.7
5                5                  14.5
6                7                  12.3
7                7                  21.2
8                10                 36.1

ACKNOWLEDGMENTS

The authors wish to give their appreciation to Dr. Richard Francis of Ohio State University for securing permission to use the assignment ranking computer program referenced in [10].

REFERENCES

[1] Dantzig, G. B. and J. H. Ramser, "The Truck Dispatching Problem," Management Science, 6, 80-91 (1959).
[2] Clarke, G. and J. W. Wright, "Scheduling of Vehicles from a Central Depot to a Number of Delivery Points," Operations Research, 12, 568-81 (1964).
[3] Newton, R. M. and W. H. Thomas, "Design of School Bus Routes by Computer," Socio-Economic Planning Sciences, 3, 75-85 (1969).
[4] Krolak, P., W. Felts and J. Nelson, "A Man-Machine Approach Toward Solving the Generalized Truck-Dispatching Problem," Transportation Science, 6, 149-170 (1972).
[5] Bellmore, M., J. C. Liebman and D. H. Marks, "An Extension of the (Szwarc) Truck Assignment Problem," Naval Research Logistics Quarterly, 19, 91-99 (1972).
[6] Svestka, J. A. and V. E. Huckfeldt, "Computational Experience with an M-Salesman Traveling Salesman Algorithm," Management Science, 7, 790-799 (1973).
[7] Srivastava, S. S., S. Kumar, R. C. Garg and P. Sen, "Generalized Traveling Salesman Problem Through n Sets of Nodes," Canadian Operational Research Society, 7, 97-101 (1969).
[8] Murty, K. G., "An Algorithm for Ranking All the Assignments in Order of Increasing Costs," Operations Research, 16, 682-687 (1968).
[9] Stern, H. I., P. Rodriguez and M. L. Utter, "The M-Traveling Salesman Problem with Ordered Visits," Operations Research and Statistics Paper No. 37-71-P5, Rensselaer Polytechnic Institute, Troy, New York (1971).
[10] Fluharty, E., "Solving the Quadratic Assignment Problem by Ranking Assignments," Masters Thesis, Ohio State University (1969).

JOHNSON'S APPROXIMATE METHOD FOR THE 3 × n JOB SHOP PROBLEM

Wlodzimierz Szwarc and George K.
Hutchinson
School of Business Administration
University of Wisconsin-Milwaukee
Milwaukee, Wisconsin

ABSTRACT

The effectiveness of Johnson's Approximate Method (JAM) for the 3 × n job shop scheduling problem was examined on 1,500 test cases, with $n$ ranging from 6 to 50 and with the processing times $A_i$, $B_i$, $C_i$ (for item $i$ on machines A, B, C) being uniformly and normally distributed. JAM proved to be quite effective for the case $B_i < \max(A_i, C_i)$ and optimal for $B_i < \min(A_i, C_i)$.

1. INTRODUCTION

The 3 × n job shop problem can be defined as follows. Machines A, B, C process each of the $n$ items $1, 2, \ldots, n$ in the order A, B, C. Given the processing times ($A_i$, $B_i$, $C_i$ for item $i$ on machines A, B, C respectively), the problem is to find a sequence minimizing the timespan.

S. M. Johnson [2] provided a very quick method for solving the 2 × n problem which, when adapted to the 3 × n case, reads as follows. Solve instead a 2 × n problem, assuming the processing times for item $i$ on the first and second machines to be $A_i + B_i$ and $B_i + C_i$ respectively. The resulting sequence $p$ is not necessarily optimal, however.

R. J. Giglio and H. M. Wagner [1] tested Johnson's Approximate Method (JAM) on twenty 3 × 6 trials (with processing times uniformly distributed between 1 and 30). In 9 out of 20 trials the optimal sequence was produced. To check whether $p$ is optimal they had to enumerate all $n!$ permutations (in general there is no other way) and find the respective elapsed time.

Let $T(p)$ and $t(p)$ be the timespans for sequence $p$ in the 3 × n and 2 × n $(A + B, B + C)$ problems respectively. W. Szwarc established in [3] sufficient optimality conditions for JAM. Sequence $p$ is optimal for the 3 × n problem if

(1) $T(p) = t(p) - \sum_{i=1}^{n} B_i$

REMARK: The right hand side of (1) is actually the lower bound of the minimal timespan of the three machine problem.

The efficiency of Johnson's two machine method makes it possible to test JAM (via (1)) for practically any number of items. We examined 1,500 3 × n trials with $n$ ranging from 6 to 50. The processing times were drawn from a uniform and a normal distribution. The results show that JAM is quite effective if $B_i < \max(A_i, C_i)$ for all $i$ (see Tables 1 and 2). JAM always produced an optimal solution for the case $B_i < \min(A_i, C_i)$, which led us to believe that a theorem confirming optimality of JAM (for this case) was true. This was subsequently proven in [4]. We feel that JAM is actually much more effective than the results indicate, because (1) is too strong a condition. Moreover (see Section 3), its effectiveness seems not to depend on the number of items.

2. COMPUTATIONAL RESULTS

The program was written in ECSL and run on the UNIVAC 1110. We performed 30 separate runs, each providing three separate streams of random numbers $A_i$, $B_i$, $C_i$ for the 50 3 × n trials as well as calculating $T(p)$ and its lower bound. The results are summarized in Table 1.

Table 1

Distribution of        Restrictions            Number of items n
A_i, B_i, C_i          on B_i                  6        10       20       50       Row
Normal m=50, σ=5       B_i < max(A_i, C_i)     43 48    46 47    47 43    -        1
Uniform 1-99           B_i < max(A_i, C_i)     42 32    46 47    43 44    49 44    2
                       None                    18 14    21 16    24 25    21 24    3
                       B_i < min(A_i, C_i)     50 50    50 50    50 50    50 50    4

REMARKS:
1. Each cell registers the results of two independent 50 trial runs.
2. The numbers in the cells indicate how many times per run (i.e., out of 50) $T(p)$ was equal to its lower bound.
3. The three streams generate numbers in $A_i$, $B_i$, $C_i$ order for each $i = 1, \ldots, n$. However, in the case when the $B_i$ are restricted, the values of $B_i$ are interchanged whenever necessary with $A_i$ or (and) $C_i$ to comply with the appropriate restriction.
4. The run times in CPU seconds per run (50 trials) for $n$ = 6, 10, 20, and 50 are 25.69, 47.10, 129.13, and 588.93 respectively.

3. STATISTICAL VERIFICATION OF THE RESULTS

The first analysis was to examine whether the random numbers generated in each run affect the results within each cell. According to the standard chi-square test based on the data in the first two rows (excluding $n = 50$), the two results in a cell are not statistically significantly different at the 20% level. Moreover, there are no statistically significant differences (at the 20% level) between 1) the distributions, 2) the number of items $n$, 3) the interactions of $n$ and the distributions.

Applying the same technique to the data from the last three rows, we found that 1) there are no statistically significant differences between the interactions of $n$ and the restrictions on $B_i$ (level 20%); 2) as expected, there is a strong statistically significant difference between the various restrictions on $B_i$ (level 0.1%).

The conclusion that the effectiveness of JAM does not depend on $n$ seems rather surprising. Hence we decided to verify it with another test. The ratio of the actual $T(p)$ to its lower bound (in percent) was found for each of the 800 trials from rows 2 and 3. The entries of Tables 2 and 3 register the number of trials belonging to a specific ratio and number of items category. The numbers on the left hand side in Tables 2 and 3 symbolize unit intervals. For instance, 104 means the half closed interval [104, 105).

Table 2. Case B_i < max(A_i, C_i)

Ratio (actual time / lower bound), rows printed: 100, 101, ..., 115
Number of items: 6, 10, 20, 50

Ratio   6     10    20    50     Total
100     83    97    97    100    377

Remaining entries in printed order: 7 1 3 3 1 1 1 3 1 1 1
Remaining row totals in printed order: 11 4 2 3 1 1 1

Table 3. No restriction on B_i

Ratio (actual time / lower bound); counts listed in the order 6, 10, 20, 50 items as printed, followed by the row total

Ratio       Number of items (6, 10, 20, 50)   Total
100         35 41 52 52                       180
101         5 3 3 4                           15
102         4 5 4 1                           14
103         7 3 4 1                           15
104         5 9 1 6                           21
105         3 2 5 4                           14
106         4 6 2 4                           16
107         2 2 3 2                           9
108         1 3 1 2                           7
109         3 1 4                             8
110         2 4 1 3                           10
111         2 3 2                             7
112         2 2                               4
113         2 1 1 1                           5
114         3 2 1 1                           7
115         2 4 3 4                           13
116         1                                 1
117         2 2 3 2                           9
118         1 1                               2
119         1 1                               2
120         3 1 1                             5
121         1 1 3                             5
122         1                                 1
123         3 1 1                             5
124         1 1                               2
125         2                                 2
126         2 1                               3
127
128
over 128    7 7 2 2                           18

The distributions were compared pairwise using the Kolmogorov-Smirnov two sample test. The null hypothesis that there were no differences could not be rejected at a 10% level of significance. The results from Table 2 show that almost all trials are within a one percent range of the lower bound (row 1) for $n > 10$. Note also that the entries in the first rows of Tables 2 and 3 happen to increase as $n$ increases.

REFERENCES

[1] Giglio, R. J. and H. M. Wagner, "Approximate Solutions to the Three Machine Scheduling Problem," Operations Research, 12, 305-324 (1964).
[2] Johnson, S. M., "Optimal Two and Three-Stage Production Schedules with Setup Times Included," Naval Research Logistics Quarterly, 1, 61-68 (1954).
[3] Szwarc, W., "Mathematical Aspects of the 3 × n Job-Shop Sequencing Problem," Naval Research Logistics Quarterly, 21, 145-153 (1974).
[4] Szwarc, W., "Special Cases of the Flow-Shop Problem," School of Business Administration, University of Wisconsin-Milwaukee, Dec. 1975; to appear, Naval Research Logistics Quarterly, 24, No. 3, Sept. 1977.

A CONVEX PROPERTY OF AN ORDERED FLOW SHOP SEQUENCING PROBLEM*

S. S. Panwalkar
Department of Industrial Engineering
Texas Tech University
Lubbock, Texas

A. W. Khan
American Science and Engineering Company
Houston, Texas

*Partial support for this research was provided by the National Science Foundation Grant GK-2869.

ABSTRACT

A flow shop sequencing problem with ordered processing time matrices is considered. A convex property for the makespan sequences of such problems is discussed. On the basis of this property an efficient optimizing algorithm is presented. Although the proof of optimality has not been developed, several hundred problems were solved optimally with this procedure.

1. INTRODUCTION

Consider the classical $n$ job, $m$ machine flow shop sequencing problem. Recently, Smith, et al., [1] have introduced a subcategory of the flow shop problem with "ordered processing times." Two properties are stipulated for such an "ordered flow shop problem": (i) if a particular job 'a' has a smaller value of processing time on any machine compared to another job 'b', then job 'a' will have smaller (or equal) processing time compared to job 'b' on all machines, and (ii) if any job has its $j$-th smallest processing time on any machine, then every job will have its $j$-th smallest processing time on the same machine.

Note that due to the characteristics described above, we can conveniently number jobs in the ascending order of processing times. In the following we will consider the ordered flow shop problem with minimum makespan criterion and permutation schedules only.

2. NOTATION

Let $N$ denote the set of $n$ jobs and let the jobs be numbered in ascending order of their processing times. Job $n$ will then be referred to as the "largest job."

Let the set $N$ be divided into two partitions, $\sigma$ and $\sigma'$, such that $\sigma$ contains at least one job ($\sigma'$ may be empty if $\sigma = N$). Assume for convenience that the largest job is always included in $\sigma$. Let $r$ denote the number of jobs in $\sigma$ $(1 \le r \le n)$.

3. PREVIOUS RESULTS

Smith et al., [1, 2] have presented some interesting results which can be represented by the following two theorems. The statements of the theorems are modified slightly to match the above notation, and the proofs of these theorems may be found in [1, 2].
THEOREM 1 [1]: In the ordered flow shop problem a minimum makespan sequence is given by arranging jobs in descending (ascending) order of processing times if the largest processing time for every job occurs on the first (last) machine.

THEOREM 2 [2]: In the ordered flow shop problem there exists a minimum makespan sequence of the form $\sigma\sigma'$, where jobs in $\sigma$ are arranged in ascending order of processing times followed by jobs in $\sigma'$ in descending order of processing times.

It may be noted that the second theorem is a more general one. Thus there are $2^{n-1}$ sequences (for all values of $r$) in which an optimum sequence can be found. Each sequence satisfying Theorem 2 is said to have a "pyramid structure," and for a sequence with pyramid structure the value of $r$ indicates the position of the largest job. Smith et al., [2] have presented an enumeration algorithm (S-P-D method) to evaluate $2^{n-1}$ sequences and recommend the use of a branch and bound procedure for further improvement.

4. CONVEX NATURE OF THE ORDERED PROBLEM

In order to explore some additional properties of the ordered problems (and possibly to develop a more efficient solution procedure), several problems were solved by the S-P-D method. For each problem, the makespan values of all $2^{n-1}$ sequences were analyzed.

For a given value of $r$ defined above, there are $_{n-1}C_{r-1}$ possible sequences with a pyramid structure. Let $S_r$ represent the set of these sequences. Let $S_r^*$ represent the best sequence and $T_r^*$ the corresponding value of the makespan. If an optimal sequence for the complete problem (satisfying Theorem 2) occurs corresponding to $r = k$, it was observed that for all enumerated problems

$$T_1^* > T_2^* > \cdots > T_{k-1}^* > T_k^* < T_{k+1}^* < \cdots < T_n^*.$$

This characteristic will be explained with an example. Consider the 6 job, 6 machine problem in Table I. For this problem the S-P-D algorithm will generate 32 sequences.

Table I. Processing Time Matrix for an Ordered Flow Shop Problem

        Machine
Job     1     2     3     4     5     6
1       3     13    7     6     8     9
2       19    85    42    37    50    55
3       26    116   57    51    68    75
4       29    130   64    56    76    84
5       36    161   80    70    95    104
6       68    304   150   132   179   197

These sequences can be divided into the 6 sets as follows:

$S_1$ = (654321)
$S_2$ = (165432, 265431, 365421, 465321, 564321)
$S_3$ = (126543, 136542, 146532, 156432, 236541, 246531, 256431, 346521, 356421, 456321)
$S_4$ = (123654, 124653, 125643, 134652, 135642, 145632, 234651, 235641, 245631, 345621)
$S_5$ = (123465, 123564, 124563, 134562, 234561)
$S_6$ = (123456)

The sequence with minimum makespan in each of the above sets and the corresponding value of makespan are given below.

$S_1^*$ = 654321    $T_1^*$ = 1357
$S_2^*$ = 265431    $T_2^*$ = 1338
$S_3^*$ = 126543    $T_3^*$ = 1332
$S_4^*$ = 123654    $T_4^*$ = 1373
$S_5^*$ = 123465    $T_5^*$ = 1419
$S_6^*$ = 123456    $T_6^*$ = 1476

The makespan for the best sequence in each set is plotted in Figure 1 against the position of the largest job, $n$, in that sequence. With each change in position of the largest job, starting from either end, the value of makespan decreases until the optimal value is reached and then it increases. It may be noted that although the figure represents discrete quantities, the lines drawn through the various points represent a "convex" shape.

A mathematical proof for such a property has not been developed. All enumerated problems, however, exhibited this property. A simple algorithm based on this property is presented in the appendix. Two hundred problems ranging in size from 6 × 15 to 14 × 15 were solved by the proposed method as well as by a branch and bound method using Theorem 2 properties. All problems solved by the proposed method gave optimal results (as confirmed by the branch and bound solutions). The proposed method proved to be more efficient in all cases. Finally, we would like to encourage readers interested in the problem to develop mathematical proofs.

[Figure 1. Variation of minimum makespan with the position of largest job n. Horizontal axis: position of largest job.]

APPENDIX

The step by step procedure for the proposed algorithm is as follows:

STEP 1: Renumber the jobs according to the ascending order of magnitudes of the processing times.
STEP 2: Set $i = n + 1$ and set $T_{n+1}^* = \infty$.
STEP 3: Set $i = i - 1$. If $i = 0$, go to Step 6.
STEP 4: With the largest job in the $i$-th position, evaluate all sequences with pyramid structure for makespan (i.e., all sequences with jobs in ascending and then descending order of processing times). When each new sequence is generated, compare it with the previous best sequence and store only the best sequence. When all the pyramid shaped sequences with the largest job in the $i$-th position are evaluated, denote the best sequence by $S_i^*$.
STEP 5: If $T_i^* < T_{i+1}^*$, go to Step 3.
STEP 6: The sequence $S_{i+1}^*$ is an optimal sequence.

REFERENCES

[1] Smith, M. L., S. S. Panwalkar, and R. A. Dudek, "Flow Shop Sequencing with Ordered Processing Times Matrices," Management Science, 21, 544-549 (1975).
[2] Smith, M. L., S. S. Panwalkar and R. A. Dudek, "Flowshop Sequencing Problem with Ordered Processing Time Matrices: A General Case," Naval Research Logistics Quarterly, 23, 481-486 (1976).

A MANPOWER PLANNING/CAPITAL BUDGETING MODEL (MAPCAB)

Rolf H. Clark
Office of Chief of Naval Operations
Washington, D.C.

Robert A. Comerford
University of Rhode Island
Kingston, Rhode Island

ABSTRACT

A deterministic resource allocation model is developed to optimize defense effectiveness subject to budget, manpower, and risk constraints. The model consists of two major submodels connected by a heuristic. The first is a mathematical program which optimizes the multiperiod weapon mix subject to the constraint set. The second is a manpower supply model based on a transition matrix in which individual transitions are functions of personnel related budgets and historical transition rates.
The heuristic marries the submodels through an iterative process leading to improved solutions. An example is provided which demonstrates how systems are undercosted and overprocured if manpower supply is not properly reflected relative to manpower demand.

I. INTRODUCTION

Background

In economic decisions, costs are normally assigned to an investment depending on the inputs used to produce and operate it. Similarly, weapon systems are typically costed by the resources needed to develop, operate, and support them.* Because of the unique conditions of military manpower development, costing the manpower inputs as a function of manpower used is not only inaccurate, but actually erroneous. The error results because manpower levels required to man a set of weapons may require large pipeline inventories of skills not fully used, but needed to ensure adequate levels of more senior or more specialized skills which are used. If costs of a weapon system are determined by the manpower utilized directly, without accounting for the pipeline inventories, then the tendency will be to under-cost and over-invest in such systems. This will result in inability to man them adequately.

*As an example, the U.S. Navy's guidelines for economic analysis as stated in Secretary of the Navy Instruction 7000.14 of 14 March 1973 specify that personnel costs are charged according to ". . . the cost of military personnel services involved directly in the work performed."

Unaccounted for "pipeline" personnel costs apply not only to manpower waiting to be developed into more useful skills, but also to manpower which has moved from a junior but valuable skill to a more senior but overmanned skill. Retired personnel are an obvious example, but other senior active skills may be of less real worth (in terms of opportunity costs) than a more junior grade.
The problem of improper weapon accumulation is compounded if improper discounting techniques make the present value of future pension costs less than they should be.* The result is to obtain systems more manpower intensive than is efficient.

An alternative approach avoiding the pipeline error is needed. Manpower supply must be properly accounted for in allocating Defense resources. In the proposed model, decision making in peacetime is the assumed scenario, though extensions to wartime are straightforward. The basic question will be what mix of weapon systems to accumulate over a long term planning horizon. The present value of the selected weapon configuration serves as the objective. The model accounts for all personnel types, including those such as recruits, trainees, and retirees, who are not specifically productive. It also can account for all budget allocations within the annual budget. These allocations are called subbudgets, and parallel actual Defense expenditures.

The model reflects the following assumptions:

1) Defense budgets are obligated annually. They are based on national macro-economic policy (which, in turn, reflects national security policy), and are functions principally of GNP and the unemployment rate. Since they are determined by internal factors, at least in peacetime, defense budgets are reasonably predictable.

2) Defense budgets are fixed for the year. Defense cannot borrow or lend funds to increase or diminish the annual budget.

3) While the social cost of capital is basic to the decision society makes when it decides how much to allocate to Defense, it is not equivalent to the cost of capital which Defense should base its decisions on once it has received the funds. That defense investment and the social cost of capital are not closely related has been upheld empirically.
During periods of high world tension, defense expenditures have reflected imputed discount rates in excess of 20% while the social cost of capital (as measured by the treasury bond rate) remained at 3% during the same period.†

4) The military labor market is very imperfect. Capital is not a surrogate for manpower in the short run. Dollars cannot be converted into manpower for two reasons. First, military personnel are hired only as recruits, with needed skills then developed. Second, all equal grades are paid essentially the same wage regardless of relative worth. (The existence of specialty pay counters but does not neutralize this point.)

5) Due to the unforeseeable tumultuous effects of sudden changes in Defense posture, changes in Defense policy must occur gradually to allow the economy to adjust. Also, risk considerations, such as maintaining some minimum defense posture, must be accommodated, and initial conditions of the analysis must match current defense configurations.

*Department of Defense Instruction 7043.3 of 18 October 1972, "Economic Analysis and Program Evaluation for Resource Management" specifies a 10% discount rate for all defense investment decisions. The proper rate for discounting social investments is discussed from a conceptual standpoint in Arrow and Kurz [1]. But as will be seen from the model below, an even more serious error than an improper rate is to disregard future pension costs altogether. This practice results from using limited planning horizons, and since pension costs are among the last to occur, causes manpower costs to be biased downward.

†For a discussion of discount rates in excess of 20% see Hitch and McKean [7, p. 211].

Rationale for the Manpower Planning/Capital Budgeting (MAPCAB) Model

Model development is based on the above assumptions. Collectively these assumptions make the Defense situation unique.
However, some capital budgeting techniques borrowed from business decision theory, combined with familiar optimization methodology, can clarify the Defense resource allocation problem. MAPCAB combines some standard manpower planning techniques with capital budgeting criteria in a mathematical program to yield solutions. An iterative methodology used in the model makes the underlying interrelations between manpower and capital allocation visible to the decision maker. This facilitates coordination between the policy maker and the analyst when the model is parametrically exercised.

The model has the following features:

1) Superficially, the proposed technique for discounting is radically different from the current cost of capital orientation. The justification for the discount rate concept is, as usual, the relative advantage of current consumption over future consumption. But two questions arise. First, if Defense is to discount, what is the equivalent of "consumable" for defense? Second, what is the value of current consumables relative to future ones? Defense exists to deter, defend, or attack enemies. It can do that only with employable weapon units like naval task groups, ballistic missile submarines, bomber wings, missile defense units, etc. Resources can either be expended to have such weapon units now, or they can be directed toward building shipyards, conducting research and development, and building training facilities so that more weapon units will be available later. Viewed in this light, the consumables are employable weapon units. The worth of current relative to future weapons is obviously dependent on the tension level now versus the future. Consequently the objective function consists of the cumulative present value of deployable weapon systems, discounted as a function of the tension level. The tension level is measured as the probability of engagement in hostile activity during each year of the planning horizon.
(The probabilities could be obtained from a panel of intelligence experts through Delphi techniques.*) The objective function variables can be adjusted for tactical and/or physical depreciation, reliability, and relative or absolute payoff.

2) The objective is maximized subject to budget and manpower constraints. Budgets and manpower limitations are estimated for a multiperiod planning horizon. Since future manpower supply is partially controlled by certain budget allocations (e.g. recruiting and training budgets), alterations to such budgets are made and new manpower estimates obtained. Changing manpower levels in turn affects budget allocations. In essence an entire new set of constraints is obtained. The new optimum objective is compared with the original solution. Through a process of such iterations, manpower supply affects weapon selection and weapon selection affects manpower supply, until a solution reasonably near an optimum obtains.

3) Large shifts in defense posture could cause economic tidal waves in the private sector. Such effects are prevented in the model by constraints which restrict year-to-year changes in weapon levels and/or budget allocations to a predetermined percentage of the earlier year's level. Additionally, risk is included as a constraint which places lower bounds on selected variables.

4) Finally, the model assumes divisibility in the variables. While one could argue for integral weapon units, most weapons are either sufficiently numerous to negate the value of an integer solution (e.g. FBM Submarines), or, in fact, are divisible (task groups).

*For a description of Delphi see Quade, "When Quantitative Models are Inadequate" [9, pp 333-343].

The MAPCAB model as presented in the next section incorporates the above factors in a linear program. It uses opportunity costs derived from the optimization process to make the budget allocation decisions which alter future manpower supply.
The procedure compares normalized opportunity costs of the various scarce factors, and then heuristically alters manpower related budgets. Perceived shortages in manpower relative to capital are reduced by deducting funds from non-personnel related budgets and applying them to recruiting more people, or raising wages to retain people. Perceived shortages in skilled relative to unskilled manpower are reduced by shifting recruiting funds to training budgets. Budget reallocation continues so long as improvements in the objective function result. Eventually the opportunity costs will yield no further improvement within the budget flexibility allowed.

The model is intended to assist in long term policy decisions. Directions rather than exact magnitudes of movements are perhaps the most relevant output of a policy model. The type of policy questions which the model can answer include the following: Should manpower levels be increased or decreased? Should a particular system be expanded or phased out? What skills will be in the shortest supply if a certain mix is accumulated?

This initial model is entirely deterministic. Stochastic extensions accounting for distributions of annual budgets and manpower transitions are proper for follow-on studies.

II. MODEL DETAILS

A. Notation

A  Vector A.
A  Matrix A.
$a_{kjt}$  Amount of budget k needed per unit of weapon j in year t.
$\bar{B}$  Average value of budget dual variables for planning horizon T. (See equation (D-3).)
$B_{kt}$  Budget type k available in year t.
$B_{rt}$  Recruiting budget for year t.
$B_{st}$  Training budget for year t.
$B^*_{kt}$  Budget $B_{kt}$ adjusted to reflect relative worth of money to manpower duals. (See equations (D-4), (D-6), and (D-7).)
$B^{**}_{kt}$  Budget $B_{kt}$ adjusted to reflect relative worths of money to manpower duals and relative worths of specialized to unspecialized skills. (See equations (D-12) and (D-13).)
$B_t$  Total budget allocated to Defense in year t.
$c_{rgct}$  Accounting cost of skill code rgc in year t.
Includes all personnel related costs, such as wages, incentives, housing, medical, and administrative costs.
$c_{rgu,rgs}$  The accounting cost of converting a person of skill code rgu into skill code rgs.
$D_i$  The discount rate used in discounting the objective for year i. This rate is based on the world tension level for year i rather than on the cost of capital.
$d_{rgct}$  Dual variable associated with the constraint indexed by rgct.
$d'_{rgct}$  The dual variable d adjusted to dollar units. (See equation (D-14).)
$d''_{rgct}$  The adjusted dual d' adjusted a second time for the training cost of developing skill code rgc. (See equation (D-15).)
$\bar{M}$  The average value of all duals associated with manpower constraints averaged over the entire planning horizon T. (See equation (D-2).)
$m_{ij}$  The number of men of skill code $rgu_j$ which can be converted to code $rgs_i$ if the recommended budget amount is added to the current budget for such conversions. (See equation (D-20).)
$M_{00t}$  The number of recruits obtained in year t.
$M_{p0t}$  Number of retired persons living in year t.
$M_{q0t}$  Number of persons deceasing in year t.
$\underline{M}_{rgc0}$  The vector of manpower levels existing at time 0.
$m_{rgcjt}$  The number of men of skill code rgc required to man a unit of weapon type j in year t.
$M_{rgct}$  The supply of men of type rgc available in year t.
$\underline{M}_{rgct}$  The vector of all manpower skill codes available in year t.
$N_i$  Total number of dual variables associated with constraints indexed by manpower type rgi for the entire planning horizon T. If there were n different rgi type skills in each year t, then $N_i = nT$.
$P_{jt}$  The relative payoff or utility of weapon type j in year t. P reflects reliability, depreciation, obsolescence, and mission weighting, and is discussed in formulas (B-1) and (B-4).
rgc  The index indicating manpower type rate/grade/code rgc. The underlining of an index does not imply vector notation; rather it indicates a single index. The triple designation is used to facilitate implementation.
$R_{kt}$  A multiplier obtained through the heuristic defined in formula (D-5), which denotes that portion of the subbudget $B^*_{rt}$ to deduct from operating budget $B_{kt}$ when adjusting budgets to provide more manpower. Equations (D-4) and (D-6) pertain.
$R'_{kt}$  Analogous to $R_{kt}$ but designating the amount of increase required in budget $B_{kt}$ to reduce manpower supply. Equations (D-7) and (D-8) pertain.
$\bar{S}$  The average value of all duals associated with specialized manpower constraints averaged over the entire planning horizon T. (See equation (D-9).)
$S_t$  The average of all duals associated with specialized manpower constraints averaged over year t. Equation (D-11) pertains.
$S_i$  The fraction of the reallocated recruiting budget which should be applied to developing specialty $rgs_i$ in year t. (See formulas (D-16), (D-17), and (D-20).)
$T$  The planning horizon, i.e. t = 0, 1, 2, . . ., T.
$T_t$  The transition matrix for year t characterized in (C-1).
$T'_t$  The readjusted transition rate reflecting the change in $T_t$ necessitated by the budget reallocations which affect manpower supply. (See equations (D-21) and (D-22).)
$U_j$  The heuristic denoting the fraction of the budget reallocation to develop new specialized skills $rgs_i$ which should be used to retrain specifically skill $rgu_j$. (See equations (D-19) and (D-20).)
$\bar{U}$  The average value of all duals associated with unspecialized manpower constraints averaged over the entire planning horizon T. (See equation (D-10).)
$U_t$  The average of all duals associated with unspecialized manpower constraints averaged over year t. Equation (D-11) pertains.
$x_{jt}$  The amount of type j weapon units to be deployable in year t.
$Y_{rt}$  The amount by which the recruiting budget $B^*_{rt}$ should be shifted to or from training budget $B_{st}$ in year t. (See equations (D-11), (D-12) and (D-13).)

B. Optimization Phase

Objective Function

The objective function is assumed linear. Extensions to include diminishing marginal returns are appropriate but not considered here.
Let $\{x_{jt}: j = 1, 2, \ldots, J;\ t = 1, 2, \ldots, T\}$ represent the set of deployable weapon systems selected for the planning horizon T, where $x_{jt}$ represents the number of units of weapon type j available in year t. Assume there are known factors $r_{jt}$, $q_{jt}$, and $s_{jt}$ representing reliability, physical depreciation, and tactical depreciation, respectively. These three factors are included in this discussion because reliability, physical depreciation, and tactical depreciation are all parameters characterizing a weapon unit. A brief digression to discuss how values for each can be obtained is therefore appropriate.

Reliability $r_{jt}$ reflects, essentially, the "up time" of a weapon unit. If (at time t) it requires four aircraft in the inventory to keep one operational, then $r_{jt}$ would be .25. Such data is available for most systems.

Physical depreciation $q_{jt}$ represents system decay in the form of parts support. As systems age, repair and support costs tend to increase; thus $q_{jt}$ would normally be decreasing in time. Again, data for such depreciation should be available from logistic records. A depreciation factor similar to accounting depreciation could be used to obtain values of $q_{jt}$ for some systems if data is lacking.

Tactical depreciation is a function of the potential adversary's systems. Values for $s_{jt}$ would have to be based on historical estimates of system tactical life. An estimate of tactical decay for model purposes might, if data is lacking, arbitrarily be based on exponential decay, with the decay rate being a function of the relative lifetime of different systems, e.g., aircraft being useful for 10 years, ships for 30, radars for 6, etc.

Assume now that there exist known measures of relative utility (in the foreseen scenarios) for the different weapon types. Denote these $p_{jt}$. These are of more theoretical impact to the model than the above three factors.
The $p_{jt}$ are the required output measures of the different weapon units during the different time frames; $p_{jt}x_{jt}$ would represent the raw output of the jth system in period t. How does one obtain the $p_{jt}$? Ideally, through a Delphi technique or other opinion sampling process, one obtains them from military experts. As an initial estimate, however, one could let $p_{jt}$ equal the unit cost of a system divided by the system's expected lifetime. This method implies obvious assumptions, one being that the cost of a system is a measure of its relative worth. This may not always be true, although it should be valid when a system is first produced. Nonetheless, this is an adequate starting point, one that lends itself to discussion by military planners, and therefore to subsequent adjustments to more accurate values of $p_{jt}$.

Of course, the aim is to obtain objective function coefficients reflecting the relative utilities of different systems at different times, adjusted for reliability, and for physical and tactical depreciation. Let these coefficients be denoted $P_{jt}$. Then $P_{jt} = f(p_{jt}, r_{jt}, q_{jt}, s_{jt})$. We will assume f takes the following specific form:

(B-1)  $P_{jt} = p_{jt}\,r_{jt}\,q_{jt}\,s_{jt}$.

The unadjusted effectiveness of all the weapon units available in year t, that is, total defense system effectiveness in year t, then becomes

$\sum_j P_{jt}x_{jt}$.

We would like to convert each year's defense capability to a "present value." If there exists some time preference for defense capabilities, then this can be done by an adjustment similar to time discounting. In fact, if $D_t$ represents the discount rate in year t, then the present value of year t's output becomes

(B-2)  $\left[\prod_{i=1}^{t}(1+D_i)\right]^{-1}\sum_j P_{jt}x_{jt}$.

The $D_t$ should reflect the relative needs for defense systems at different times. This translates into having high values for $D_t$ during high world tension levels, and low values during low tension levels.
State Department and Defense Department planners might be able to predict relative tension levels; certainly in the past the buildups to wars have been quite obvious, and estimates on the return to peacetime have been made. The $D_t$'s ideally would reflect these estimates. If agreement is not possible, then a constant rate for $D_t$, such as 5%, could be used. This is similar to procedures in financial discounting, where some constant rate (e.g., 10%) is used when agreement on the time structure of interest rates is not possible.

The overall present value of the objective function then is the present value of the set of selected systems $x_{jt}$. This is written:

(B-3)  $\text{Present Value} = \sum_{t=1}^{T}\left[\prod_{i=1}^{t}(1+D_i)\right]^{-1}\sum_{j=1}^{J}P_{jt}x_{jt}$

If mission versatility of the various systems is a factor, let $S_{mj}$ be a measure of the effectiveness of weapon type j in mission m and let $w_m$ be the weight specifying the relative importance of mission m in the overall defense scenario. The objective function then becomes

(B-4)  $\text{Present Value} = \sum_{t=1}^{T}\left[\prod_{i=1}^{t}(1+D_i)\right]^{-1}\sum_{j=1}^{J}\sum_{m}w_m S_{mj}P_{jt}x_{jt}$

Hereafter expression (B-3) will be used to refer to the objective; maximization is understood.

Constraints

1) Budget constraints: Let $a_{kjt}$ be the amount of budget type k needed to support one unit of weapon type j in year t, and let $B_{kt}$ be the allocation of funds to subbudget k in year t. The $B_{kt}$, $k = 1, 2, \ldots, K$, include only those subbudgets not directly dependent on manpower levels. (Examples of included budgets are weapons procurement, support equipment, non-housing construction, and central supply. Excluded budgets are wages and incentives, personnel administrative support, medical, and housing.) The budget constraints are the following for the non-personnel related budgets:

Budget Constraints for Year t

Budget 1:  $a_{11t}x_{1t} + \cdots + a_{1jt}x_{jt} + \cdots + a_{1Jt}x_{Jt} \le B_{1t}$
(B-5)  Budget k:  $a_{k1t}x_{1t} + \cdots + a_{kjt}x_{jt} + \cdots + a_{kJt}x_{Jt} \le B_{kt}$
Budget K:  $a_{K1t}x_{1t} + \cdots + a_{Kjt}x_{jt} + \cdots + a_{KJt}x_{Jt} \le B_{Kt}$

There will be one such set of constraints for each year t.
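As a rough illustration of how the objective (B-3) and the budget constraints (B-5) fit together, the following Python sketch evaluates the discounted effectiveness of a candidate weapon plan and checks it against the operating budgets. The function names, the two-weapon, two-year setup, and every number are hypothetical, and the coefficients are assumed to take the product form of (B-1); this is a sketch for checking a given plan, not an optimizer.

```python
from math import prod

def coefficient(p, r, q, s):
    """Adjusted payoff P_jt, assuming the product form of (B-1)."""
    return p * r * q * s

def present_value(P, x, D):
    """Discounted effectiveness in the spirit of (B-3).

    P[t][j]: adjusted payoff of weapon j in year t+1
    x[t][j]: deployable units of weapon j in year t+1
    D[t]:    tension-based discount rate for year t+1
    """
    total = 0.0
    for t in range(len(x)):
        # Product of (1 + D_i) over all years up to and including this one
        discount = prod(1.0 + D[i] for i in range(t + 1))
        total += sum(P[t][j] * x[t][j] for j in range(len(x[t]))) / discount
    return total

def budget_feasible(a, x, B, tol=1e-9):
    """Check constraints of the form (B-5): sum_j a[t][k][j] * x[t][j] <= B[t][k]."""
    for t in range(len(x)):
        for k in range(len(B[t])):
            used = sum(a[t][k][j] * x[t][j] for j in range(len(x[t])))
            if used > B[t][k] + tol:
                return False
    return True

# Hypothetical data: two weapon types, two years, two operating budgets
P = [[coefficient(10.0, 0.5, 1.0, 1.0), coefficient(4.0, 1.0, 1.0, 1.0)],
     [coefficient(10.0, 0.5, 0.9, 0.8), coefficient(4.0, 1.0, 1.0, 1.0)]]
x = [[2.0, 3.0], [2.0, 4.0]]
D = [0.25, 0.0]
a = [[[1.0, 2.0], [3.0, 1.0]], [[1.0, 2.0], [3.0, 1.0]]]
B = [[8.0, 9.0], [10.0, 10.0]]

pv = present_value(P, x, D)
ok = budget_feasible(a, x, B)
```

A linear programming routine would search over x for the largest present value among feasible plans; the helper functions above only score and check one plan at a time.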
The personnel related budgets excluded from (B-5) enter the analysis through considerations of manpower supply below. Note the sum of the $B_{kt}$ for each year t must be less than or equal to the annual budget $B_t$ less the sum of all personnel budgets.

2) Manpower constraints: If $m_{rgcjt}$ is the number of men of skill code rgc (representing a rate/grade/specialty-code combination) required to man and support one unit of weapon type j in year t, and $M_{rgct}$ is the estimated manpower of type rgc supplied in year t, then for year t the manpower constraints will be the following:

Manpower Constraints for Year t

$m_{111\,1\,t}x_{1t} + \cdots + m_{111\,j\,t}x_{jt} + \cdots + m_{111\,J\,t}x_{Jt} \le M_{111\,t}$
(B-6)  $m_{rgc\,1\,t}x_{1t} + \cdots + m_{rgc\,j\,t}x_{jt} + \cdots + m_{rgc\,J\,t}x_{Jt} \le M_{rgc\,t}$
$m_{RGC\,1\,t}x_{1t} + \cdots + m_{RGC\,j\,t}x_{jt} + \cdots + m_{RGC\,J\,t}x_{Jt} \le M_{RGC\,t}$

Again, one such set of constraints applies for each year t.

3) Risk Constraints: For reasons of intuition, politics, or stability, constraints placing either lower limits on the weapon units for any year or upper and lower bounds on certain ratios of the variables may be included. Several types of such constraints are:

a) Lower Bound Constraints. These reflect the concern of military planners not to allow effectiveness to dip below some minimum readiness level. The requirement that total effectiveness in year t+1 be at least R times that in year t, where R is a constant near 1.0, would be met by the following constraint:

(B-7)  $\sum_j P_{j,t+1}x_{j,t+1} \ge R\sum_j P_{jt}x_{jt}$

If R' is a positive constant less than 1.0, the following constraint ensures that system j not be phased out too suddenly:

(B-8)  $P_{j,t+1}x_{j,t+1} \ge R'P_{jt}x_{jt}$.

b) Upper and Lower Bounded Ratios. Military planners may feel that the relative ratios of different systems should be kept within certain limits. This form of risk constraint can be met by the following condition:

(B-9)  $L \le \dfrac{x_{jt}}{x_{it}} \le U$;  L, U constants.

4) Technological Constraints: It may be that factors besides budget limitations prevent the build up of certain systems. The following constraints are alternative ways of restricting the accumulation rate:

(B-10)  $\dfrac{x_{j,t+1}}{x_{jt}} \le R$;  $R > 1$,

(B-11)  $x_{j,t+1} \le x_{jt} + j^*$;  $j^* > 0$.

5) Initial Conditions: The starting point of the analysis requires that planners begin from the current weapon configuration $x_{10}, \ldots, x_{j0}, \ldots, x_{J0}$, the weapon mix effective at t = 0. This can be accomplished either by specifying the first period variables or allowing some flexibility in their values by constraints of the type

(B-12)  $a\,x_{j0} \le x_{j1} \le b\,x_{j0}$;  $0 < a < 1 < b$.

Such requirements have certain real-world value. First, they are realistic, as resources are seldom flexible enough to achieve large short term changes. Second, the starting point of the analysis coincides with the situation faced by decision makers, who must decide on future courses of action given the current status. Analytic recommendations and decision alternatives are more likely to be compatible, and the gap between decision maker and analyst is narrowed. Finally, by starting at the current mix and restricting movements to relatively small but realistic magnitudes, the assumption of linearity in the model is less restrictive.

C. Manpower Supply Phase

Manpower Transition Matrix

Let $\underline{M}_{rgct}$ be the vector of all manpower skill codes rgc for all years t. Then $\underline{M}_{rgct}$ represents the right hand side of the manpower constraints (B-6) above, augmented with those manpower codes which are not used in any of the weapon systems in the weapon mix selected, such as recruits, trainees in training, and retired personnel. The manpower supply phase of MAPCAB shows how $\underline{M}_{rgct}$ is obtained and how it is modified through changes in recruiting budgets, training budgets, etc., to provide manpower supply most advantageous to planners within the flexibility allowed. The normal approach to estimating $\underline{M}$ is through a process involving a Markov transition matrix.
*

*Such an approach is described by Forbes [5, pp. 93-113].

Since in MAPCAB the transition rates are altered through budget manipulation and since the initial "state vector" will be non-stochastic, the matrix will be referred to simply as the (deterministic) transition rate matrix $T_t$. To isolate the skill codes rgc which are actually used to man systems from those manpower codes which are necessary but not used, the following additional notation is introduced. Let

$M_{00t}$ = Number of recruits obtained in year t
$M_{p0t}$ = Number of retired people in system in year t
$M_{q0t}$ = Number leaving (quitting or deceasing) in year t
$T_{rgc,rgc',t}$ = transition rate from rgc to rgc' in t.

The transition matrix $T_t$ can then be represented:

(C-1)  $T_t = \begin{array}{c|cccc} & \cdots & M_{rgc't} & M_{p0t} & M_{q0t} \\ \hline M_{00,t-1} & \cdots & T_{00,rgc'} & \cdots & \cdots \\ M_{rgc,t-1} & \cdots & T_{rgc,rgc'} & T_{rgc,p0} & T_{rgc,q0} \\ M_{p0,t-1} & \cdots & \cdots & T_{p0,p0} & T_{p0,q0} \end{array}$

There will be one such matrix for each year t, with different values for the transitions likely.

Given the matrices $T_t$, the initial manpower supply vector $M_{rgc0}$, and the scalar recruiting quantities $M_{00t}$ for t = 0, 1, . . ., T-1, the manpower supply vector $M_{rgct}$ for any year t can be determined using the following formula, which is derived in the appendix:

(C-2)  $M_{rgct} = \prod_{i=1}^{t}T_i\,M_{rgc0} + \sum_{j=0}^{t-1}\prod_{i=j}^{t-1}T_{i+1}\,M_{00j}$

Subsequently the transitions in $T_t$ will be discussed in more detail to show how each transition is obtained. It will be found that some transitions are the result of personal preferences on the part of the personnel, while others are the direct result of funds made available for specific training. For the former, planners must rely on historical data and estimated relationships between civilian/military advantages, while the latter are more directly and accurately obtained.
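The year-by-year recursion that underlies (C-2) can be sketched in a few lines of Python. The sketch makes several assumptions that are not taken from the paper: the matrix is row-stochastic (each row gives the fractions of one code moving to every code, including itself), the "left" state is treated as a cumulative absorbing state, each year's recruits are added to a recruit state after that year's transition, and the five state names and all rates are invented for illustration.

```python
STATES = ["recruit", "rgu", "rgs", "retired", "left"]

def step(stock, T, recruits):
    """One-year update: new[j] = sum_i stock[i] * T[i][j], then add recruits."""
    n = len(stock)
    new = [sum(stock[i] * T[i][j] for i in range(n)) for j in range(n)]
    new[0] += recruits  # recruits obtained this year enter the recruit state
    return new

def project(stock0, matrices, recruits):
    """Apply one transition matrix per year; return the stock after each year."""
    for T in matrices:
        # Each row must sum to one so that nobody is created or lost in transit
        assert all(abs(sum(row) - 1.0) < 1e-9 for row in T), "rows must sum to one"
    history = [list(stock0)]
    for T, r in zip(matrices, recruits):
        history.append(step(history[-1], T, r))
    return history

# Hypothetical rates: rows = from-state, columns = to-state (order of STATES)
T1 = [
    [0.0, 0.9, 0.0, 0.0, 0.1],    # recruits become unspecialized or leave
    [0.0, 0.7, 0.2, 0.0, 0.1],    # rgu: stay, specialize, or leave
    [0.0, 0.0, 0.8, 0.1, 0.1],    # rgs: stay, retire, or leave
    [0.0, 0.0, 0.0, 0.95, 0.05],  # retired: remain or leave the rolls
    [0.0, 0.0, 0.0, 0.0, 1.0],    # left: absorbing
]
history = project([100.0, 500.0, 200.0, 50.0, 0.0], [T1, T1], [120.0, 110.0])
```

Because every row sums to one, total headcount across all states changes only by the recruits added, which gives a quick consistency check on any set of rates.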
It should be pointed out that the $T_i$, $i \ge 1$, are specified once given the initial manpower vector $M_{rgc0}$, the vector of recruiting budgets over time, the vector of training budgets over time, and the initial transition matrix $T_0$. For, given the preferences of the individuals concerned and the known quantities of recruits, trainees, etc., each year, the fact that the matrix rows sum to one determines, under certain realistic assumptions, the remaining $T_{rgc,rgc'}$ (i.e., those besides the transitions to skills, assuming fixed proportions of transitions to voluntary positions).

D. The Optimization/Manpower Supply Relationship

The optimization process described above yields primal and dual solutions under proper conditions. The proposed improvement to the allocation process involves a heuristic based on the dual variables (opportunity costs) associated with the primal budget and manpower constraints.

To describe the heuristic, further definitions and notation are needed. Let the subscripts u and s denote unspecialized and specialized skills. Such specialized training is acquired through formal training programs, such as schooling, at significant cost. A person of rate/grade rg who has received formal training of type s would then be classified by skill code rgs, while the same rate/grade without specialty training would be designated skill code rgu. This dichotomization of the previous subscript rgc is necessary to allow taking the training budgets properly into account. The transitions to unspecialized skills are caused by such relatively uncontrollable factors as time in rate, personal desires of individuals, and individual motivation.
The transitions to specialty skills, however, are mainly a function of training funds available, and it is here that the decision makers have some real flexibility in manpower planning, for budget allocations can be altered to provide increased or decreased transition rates to the various specialty skills.

The general approach of the heuristic in taking this flexibility into account is the following. First, an overall measure of manpower value relative to capital value is obtained by an aggregate comparison of the opportunity costs associated with manpower versus budget constraints. A determination is then made to either increase funds allocated to recruiting or to decrease them; the former if manpower duals exceed budget duals, indicating a greater need for men than money. Then an aggregate comparison is made between the specialized and unspecialized skill codes, again using the opportunity costs as the comparison factor, and funds are then shifted from recruiting to training, or vice versa, depending on the relative values of the skills. Finally a determination is made as to which specialty skills should be increased or decreased, again basing the decision on the relative opportunity costs. These steps are now described in more detail.

Details of the Heuristic

1) Redistributing operating and recruiting budgets: All budgets included in the optimization phase will be referred to as operating budgets for convenience. Let the accounting cost of skill rgu be $c_{rgu}$ and for rgs be $c_{rgs}$. These annual costs include wages and incentive pay, plus prorated medical, housing, and administrative costs relatable to personnel. Let $d_{rgut}$ and $d_{rgst}$ be the dual variables, or opportunity costs, associated with skill codes rgu and rgs in period t. Then

(D-1)  $d'_{rgit} = \dfrac{d_{rgit}}{c_{rgi}}, \quad i = u, s,$

is the opportunity cost per man converted to opportunity cost per dollar, and is then in the same units as, and therefore directly comparable with, the opportunity costs associated with budget constraints. Define the aggregate manpower related opportunity cost to be the average of all such converted manpower duals for all periods t, i.e.:

(D-2)  $\bar{M} = \dfrac{1}{N_u + N_s}\sum_t\left(\sum_{rgu}d'_{rgut} + \sum_{rgs}d'_{rgst}\right)$

where the symbol $\sum_{rgi}$ means the summation is to be taken over all skill codes rgi and $N_i$ is the number of rgi skill codes that exist during the time span T, i.e. the number of rgi type duals. Similarly, for all budgets $B_{kt}$, define

(D-3)  $\bar{B} = \dfrac{1}{KT}\sum_t\sum_k d_{kt}$.

Intuitively, if $\bar{M}/\bar{B}$ is greater than 1, then it would seem efficient to shift funds from operating budgets into manpower. Manpower can be increased by increasing recruiting rates or decreasing resignation rates. The former may be accomplished by increasing recruiting efforts and/or making service life more attractive. The latter may result from pay raises or other personnel incentives, such as increased medical and retirement benefits. For expository reasons, only the effects of alternate recruiting budgets are detailed here. Let

$\bar{M}/\bar{B} = 1 + x;\quad x > -1$.

The quantity x is a measure of the amount of redistribution proposed. The heuristic to be used is to increase recruiting budgets $B_{rt}$ by an amount $(x/z)B_{rt}$, with z an appropriate number limiting the percentage change in the budget to some reasonable amount. In order to keep the annual budget $B_t$ unchanged, some or all of the $B_{kt}$ must be decreased. Specifically, for all t, increase $B_{rt}$ to $B^*_{rt}$, where

(D-4)  $B^*_{rt} = B_{rt} + (x/z)B_{rt}$.

To maintain the annual budget at the original level, those budgets currently least binding should be decreased, "least binding" meaning having the lowest opportunity costs. To do this, define

(D-5)  $R_{kt} = \dfrac{1/d_{kt}}{\sum_k 1/d_{kt}}$.

Note that $\sum_k R_{kt} = 1$ and that for small $d_{kt}$, $R_{kt}$ is large. Therefore the desired redistribution conditions are met by the heuristic

(D-6)  $B^*_{kt} = B_{kt} - R_{kt}B^{*\prime}_{rt}$,

where $B^{*\prime}_{rt} = (x/z)B_{rt}$ is the amount by which the recruiting budget was increased in (D-4). This last expression is exemplary of the heuristic used whenever a set of budgets is to be collectively decreased by some known amount.
Were it the case that a set of budgets was to be increased, the expression would be (D-7) B\=B,a+R\tB*'u, MANPOWER PLANNING/CAPrrAL BUDGETING 175 where (D-8) and B*'u is the amount the budgets were to be increased, the subscript i vice r indicating that it need not be the recruiting budget which is used as the base. 2) Redistributing recruiting and training budgets rf In this subsection the details of subbudget allocation to balance speciaUzed and unspeciaUzed manpower opportunity costs are presented. The case where specialized manpower is relatively more valuable is treated here. The opposite case is straightforward. Let (D-9) s=^j:T.\i^ and (D-10) u=^^ j:\iid. S and U are average converted (to dollar unit) opportunity costs for specialized and imspecialized skill codes averaged over the entire planning horizon T. Assume S/U=l-\-y; y> — 1. If y>0, then funds should be shifted from developing unspecialized skills into specialized skills. This can be accomplished by shifting recruiting funds into training funds, especially in those years when specialized skills are most needed. To do so let St and Ut be the average of the adjusted duals for specialized and unspecialized skills for just the year t For each t, decrease B*rt by Yu, where (D-11) Yrt=^%f-(yB\) and designate the new annual recruiting subbudgets by (D-12) B*\t=B*r-Yr,. To absorb this decrease, increase each annual training budget, denoted B^t, by this same amount, that is, let (D-13) B**„=B„-\-Yrt where again J5** is the new budget. The phasing consideration mentioned in the previous footnote should not be overlooked. If the delay time required to convert skill rgu into rgs is very short, the above is valid. But if it takes, say, a year to turn a recruit into a useful labor unit, and a year to develop an unspecialized skill into a specialized one, then the budget redistributions must be made one, two, or even more years tThe current model stresses selecting budgets to alter and how much to alter them. 
The question of when to alter them is best handled through a systems dynamics approach. The authors are currently testing a feedback model which investigates this phasing problem.

prior to the year used for computing $S_t$ and $U_t$. In the illustration which follows later, a one year delay is assumed for converting recruits to productive labor units, and zero delay is assumed for converting u skills to s skills.

3) Readjusting the manpower transition rates:

This section describes how $Y_{rt}$ is allocated toward developing the skills rgs. Two factors apply: the relative values of the specialties and the training costs associated with developing each specialty. The relative values of the specialties are reflected in the opportunity costs $d_{rgu}$. Let the training cost of converting skill rgu into specialized skill rgs be denoted $C_{rgu,rgs}$. These costs are assumed constant over time, although the extension to allow changes over time is straightforward. They will also be assumed constant over u, since a skill rgs is developed through formal means such as schooling, and costs would be equal for all trainees. Therefore, recalling the definition

(D-14) $d'_{rgst} = \dfrac{d_{rgst}}{C_{rgs}}$,

now define

(D-15) $d''_{rgst} = \dfrac{d'_{rgst}}{C_{rgu,rgs}}$.

Here $d''$ is a weighted measure of the various opportunity costs, the weighting now being formed with respect to the relative accounting costs of the personnel and of the skill transition costs.

A three stage approach is followed. First it is determined how much of the $Y_{rt}$ should go toward training each specialty rgs in year t. Then it is determined how that training should be distributed among the rgu skills, that is, which rgu codes can best afford to be converted into rgs codes. Finally the transition rates $T_{rgu,rgs,t}$ are adjusted to maintain consistency in the transition matrix $T_t$. These three phases are now discussed separately.

Phase I: Distributing the $Y_{rt}$ among specialties: Let $rgs_i$ be a specific skill code from the set of codes rgs.
Then let

(D-16) $S_i = \dfrac{d''_{rgs_i t}}{\sum_i d''_{rgs_i t}}$.

Then $S_i$ represents the fraction of $Y_{rt}$ to allocate toward the development of specialty $rgs_i$. (Note again that $S_i$ sums to unity and a large $d''$ implies a large allocation to the manpower type it is associated with.) The amount allocated to training skill codes $rgs_i$ in year t is

(D-17) $S_i Y_{rt}$.

Phase II: Determining which unspecialized skill codes to convert into $rgs_i$ codes: The funds represented by expression (D-17) are spent in converting some rgu personnel into $rgs_i$ personnel. The specific codes to convert will be those in excess supply, or having the lowest opportunity cost. Let $rgu_j$ be a specific skill code from the set rgu, where rgu is now limited to those unspecialized skill codes which can be converted to $rgs_i$. If, analogously to (D-14),

(D-18) $d'_{rgu_j t} = \dfrac{d_{rgu_j t}}{C_{rgu_j}}$,

then a small $d'$ indicates a good prospect for conversion to $rgs_i$. Therefore let

(D-19)

and allocate $U_j$ times the total added expenditure on obtaining new $rgs_i$ skills on converting $rgu_j$ personnel into $rgs_i$ personnel; that is, allocate $U_j S_i Y_{rt}$ toward converting $rgu_j$ into $rgs_i$ (in addition to what was previously spent on the same conversion).

Phase III: Adjusting the transition rates: Each allocation $U_j S_i Y_{rt}$ can convert

(D-20) $\dfrac{U_j S_i Y_{rt}}{C_{rgu,rgs}} = m_{ij}$

men of skill code $rgu_j$ into $rgs_i$. Therefore the transition rate $T_{rgu_j rgs_i}$ for year t must be increased to reflect this additional conversion, and the remaining transitions from $rgu_j$ into other skill codes (specialized or unspecialized) must be adjusted to ensure that the rows of $T_t$ sum to unity. Since it can be assumed that the $rgu_j$ personnel being converted would otherwise have transitted to other unspecialized skills (in fact, most would have stayed $rgu_j$ or advanced a grade to $r, g+1, u_j$), the entire adjustment would be absorbed if the following facts are utilized.
First, the new transition rate $T^*_{rgu_j rgs_i}$ must be the ratio of the total number of $rgu_j$ to $rgs_i$ conversions performed in year t divided by the total manpower of type $rgu_j$ available at the end of year t-1. Therefore the new transition rate is

(D-21) $T^*_{rgu_j rgs_i t} = T_{rgu_j rgs_i t} + \dfrac{m_{ij}}{M_{rgu_j, t-1}}$.

The new transition rate from $rgu_j$ to $rgu_k$, where k indexes the unspecialized skills which can be converted into $rgs_i$, is not usually controllable (since persons choose to become specialized, rather than are forced to do so), and the most reasonable assumption is that the conversions come at the cost of $rgu_k$ skills in the same ratios as the previous conversions to such skills. More specifically,

(D-22) $T^*_{rgu_j rgu_k t} = T_{rgu_j rgu_k t} - \dfrac{T_{jk}}{\sum_k T_{jk}} \cdot \dfrac{m_{ij}}{M_{rgu_j, t-1}}$,

where, for convenience, $T_{jk} = T_{rgu_j rgu_k t}$.

E. The Iterative Process

The heuristics developed in the last section allow obtaining new transition matrices $T^*_t$ which, using the time zero manpower supply vector $M_{rgc0}$ and the new estimated numbers of recruits for each year t, yield new right hand sides for the optimization phase.†

†Equation (C-2) yields the new manpower levels.

These new right hand sides should yield an improved solution without violating the annual budget limitations. If, on rerunning the optimization phase with these new constraints, the solution becomes infeasible, the iteration should be repeated with each reallocation indicated by the heuristics decreased by a fraction, such as halved. If the solution results in a decrease in the objective, which may happen if the reallocations are so large that the optimum is passed, the reallocations should also be halved. If the steps are small enough in each iteration, the objective should show small improvements each iteration, until no further significant improvement is obtainable.
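The halving rule just described can be sketched as follows. This is a minimal illustration, not the authors' program: the reallocation is reduced to a single scalar step, `evaluate` is a hypothetical stand-in for rerunning the optimization phase (it returns None when the problem is infeasible), and the toy objective is invented purely so the loop has something to act on.

```python
def damped_reallocation(evaluate, current, step, max_halvings=20):
    """Apply a reallocation step, halving it while the rerun is
    infeasible or fails to improve on the current objective."""
    base = evaluate(current)
    for _ in range(max_halvings):
        candidate = current + step
        value = evaluate(candidate)  # None signals an infeasible rerun
        if value is not None and value > base:
            return candidate, value, step
        step = step / 2.0  # reallocation too large: halve and retry
    return current, base, 0.0  # no improving step was found


def toy_objective(x):
    """Hypothetical concave objective, infeasible beyond x = 10."""
    if x > 10:
        return None
    return -(x - 3.0) ** 2


overshoot = damped_reallocation(toy_objective, current=0.0, step=8.0)
infeasible_first = damped_reallocation(toy_objective, current=0.0, step=16.0)
```

In the first call the full step passes the optimum and is halved once; in the second the full step is infeasible, the halved step overshoots, and the quartered step is accepted.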
This point will naturally be reached if the dual variables are approximately equal, or if the limits of the problem's flexibility in terms of budget changes or manpower development are reached. Undesired occurrences, such as cycling or asymptotic behavior without significant improvement, should be easily analyzed due to the sequential nature of operations. This would not be the case if the problem was converted to a one phase optimization program.

The iterative scheme used here is a modification of the decomposition technique formulated by Kornai and Liptak in [8]. Computational forms of large scale applications of the above manpower planning model can benefit from the algorithm they originated.

F. An Illustration

Assume a hypothetical defense structure having three types of weapon units denoted A, B, and C. Also assume a four year planning horizon and a simple manpower structure consisting only of recruits, unspecialized skills u, and specialized skills s. The annual defense budget $B_t$ is constant over time. The annual discount rates to be used are obtained from intelligence experts (if war is imminent, $D_t$ is high; for distant years it could stabilize at, say, 5 percent). Also, the relative worth of the different weapon types is estimated by experienced military planners, as discussed.

Known Values:

• Annual budget = 4.851 billion dollars.
• Relative net payoffs (P): A:B:C = 100:10:25
• Discount rate: $D_t$ = (0.10, 0.08, 0.05, 0.05). (Reflecting decreasing tension levels.)
• System requirements per unit:

                                   A        B        C
    Operating budget (billions)    0.16     0.0183   0.055
    Manpower type u (thousands)    4.0      0.3      0.6
    Manpower type s (thousands)    1.9      0.16     0.3

• Cost of training skill code s: $3030 per man.
• Accounting cost per man: recruit $5000; u $8000; s $10000
• Planned budget allocations:

    Year                         1        2        3        4
    recruiting ($ billions)      0.045    0.045    0.045    0.045
    training ($ billions)        0.045    0.045    0.045    0.045

• Recruiting cost: $1000 per recruit.
• Trainees: From training budget and cost per trainee, 14,850 s-type personnel are trained each year. Initially 4,500 of these come from recruit types and 10,350 come from u types.
• Initial transition matrix $T_1$:

                 u code    s code    resigned
    recruit      0.9       0.1       0
    u code       0.35      0.15      0.5
    s code       0         0.5       0.5

• Initial values: recruits = 45,000; u code = 69,000; s code = 34,000; weapon units: type A = 10, type B = 50, type C = 20
• Restrictions: No weapon system can be decreased by more than 10% in any year.

Problems to be Considered:

With the above facts two problems will be solved. In the first, manpower constraints will be included in the operating budget rather than accounted for separately. This is the normal approach, with each system being charged for the manpower required to man and support it. The second problem, using the same data but explicitly accounting for manpower constraints, is then solved, and it yields a totally different solution with a completely different allocation of resources. The second problem is then revised through the heuristic rationale discussed, first to yield an improved solution, and then to show what happens if the reallocation is carried too far. The data for these problems are not precise, but they reasonably approximate figures for actual weapons in the U.S. defense arsenal and suffice to show the impact of using the wrong rationale.

Solutions:

Using formula (B-3) the objective function to be maximized is

(G-1) $90.9x_{A1} + 9.09x_{B1} + 22.7x_{C1} + 84x_{A2} + 8.4x_{B2} + 21.01x_{C2} + 80x_{A3} + 8x_{B3} + 20x_{C3} + 76.3x_{A4} + 7.63x_{B4} + 19.08x_{C4}$.

The manpower supplies of codes u and s, using the initially planned constant recruiting and training budgets of 45 million dollars per year, are (in thousands of men)

(G-2)
    Year       1        2        3        4
    u code     64.65    63.13    61.96    60.95
    s code     31.85    30.12    29.66    29.86

PROBLEM 1: Systems costed out with manpower implicit in operating costs.
The total annual budget of 4.851 billion dollars must be adjusted by deducting the annual training budget, the annual recruiting budget, and the annual wages paid to recruits who do not man or support any system. (The same would be true of retirees, who do not exist in this problem.) The available budget then is 4.851 - 0.090 - (45,000 × recruit wage) = 4.536 billion dollars.

Since manpower costs are to be included in the system operating costs, the budget factors for each weapon unit must be adjusted. For example, system A costs must be increased on a per unit basis by the cost of paying the skills used by the system: 4000 men times $8000, or .032 billion, for the u coded personnel used, and 1900 men times $10,000, or .019 billion, for the s coded personnel used. The per unit cost of system A is then the original .16 plus .032 plus .019 billion dollars. The new per unit costs are

    system A = 0.211
    system B = 0.0223
    system C = 0.0628

The appropriate budget constraints for years 1 through 4 thus are

(G-3) $0.211x_{At} + 0.0223x_{Bt} + 0.0628x_{Ct} \le 4.536;\quad t = 1, \ldots, 4$.

The restriction that no system be depleted by more than 10% in any year is represented by $x_{jt} \ge 0.9x_{j,t-1}$, or $0.9x_{j,t-1} - 1x_{jt} \le 0$; j = A, B, C; t = 2, 3, 4, and $x_{A1} \ge 9$; $x_{B1} \ge 45$; $x_{C1} \ge 18$.

The optimum solution is presented in Table 3 for comparison with subsequent solutions. The matrix representation of the problem is presented in Table 1.

Table 1. Problem 1

Variables: x_A1 x_B1 x_C1 x_A2 x_B2 x_C2 x_A3 x_B3 x_C3 x_A4 x_B4 x_C4

Objective (maximize): 90.9 9.09 22.7 | 84. 8.4 21. | 80. 8. 20. | 76.3 7.63 19.1

Subject to:
    Budget, each year t = 1, ..., 4:        0.21 x_At + 0.023 x_Bt + 0.068 x_Ct <= 4.536
    Depletion, j = A, B, C; t = 2, 3, 4:    .9 x_j,t-1 - 1. x_jt <= 0
    Initial year:                           1. x_A1 >= 9.0;  1. x_B1 >= 45.;  1. x_C1 >= 18.
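The Problem 1 cost adjustments can be reproduced with a short Python sketch. The variable names are illustrative only, and the unit requirements are those of the illustration's data (operating budgets 0.16, 0.0183 and 0.055 billion; u and s manpower in thousands of men). The discounting rule, dividing each payoff by the running product of (1 + discount rate), is an assumption, since formula (B-3) is not restated here; it does reproduce the coefficients printed in (G-1) to within rounding.

```python
ANNUAL_BUDGET = 4.851          # billions of dollars
RECRUITING_BUDGET = 0.045      # billions of dollars per year
TRAINING_BUDGET = 0.045        # billions of dollars per year
RECRUITS = 45_000
WAGE = {"recruit": 5_000, "u": 8_000, "s": 10_000}   # dollars per man

operating = {"A": 0.16, "B": 0.0183, "C": 0.055}     # billions per unit
men_u = {"A": 4.0, "B": 0.3, "C": 0.6}               # thousands per unit
men_s = {"A": 1.9, "B": 0.16, "C": 0.3}              # thousands per unit
payoff = {"A": 100, "B": 10, "C": 25}
discount = [0.10, 0.08, 0.05, 0.05]

# Budget left after recruiting, training and recruit wages.
available = (ANNUAL_BUDGET - RECRUITING_BUDGET - TRAINING_BUDGET
             - RECRUITS * WAGE["recruit"] / 1e9)


def unit_cost(system):
    """Operating cost per unit plus wages of the men it needs (billions)."""
    wages = (men_u[system] * 1e3 * WAGE["u"]
             + men_s[system] * 1e3 * WAGE["s"]) / 1e9
    return operating[system] + wages


def objective_coefficients():
    """Payoffs discounted by the cumulative product of (1 + rate)."""
    rows, factor = [], 1.0
    for rate in discount:
        factor *= 1.0 + rate
        rows.append({k: payoff[k] / factor for k in payoff})
    return rows


costs = {k: unit_cost(k) for k in operating}
coefficients = objective_coefficients()
```

Under these assumptions `available` comes to 4.536 and `costs` to roughly 0.211, 0.0223 and 0.0628, matching the figures in the text.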
PROBLEM 2: Manpower supply explicitly accounted for.*

*The three phases of Problem 2 are represented by the constraints of Table 2.

In this case the annual budget constraints must be adjusted to reflect not only the recruiting, training, and recruit wage costs, but also all other wages paid. Doing this yields budget constraints of 3.7, 3.73, 3.744, and 3.75 for years 1 to 4. The explicit budget and manpower constraints are shown in Table 2, and the solution is in Table 3.

The important thing to notice is that the Problem 1 solution calls for increasing system A over time and decreasing system C, while Problem 2 calls for the opposite. This disparity results because in Solution 1, since manpower and money are treated interchangeably, the systems selected are those with a low per unit operating cost (per unit of payoff). But while the solution is feasible in terms of money, note that it is infeasible in terms of manpower. In year 1, as an example, the number of s personnel needed to man the selected systems is 11.38(1.9) + 45(0.16) + 18(0.3) = 34.222 thousand men. Yet the total available supply of such men is only 31,850. If the solution is accepted, and the systems bought, they will be undermanned, not an uncommon defense occurrence.

Table 2. Problem 2

Variables: x_A1 x_B1 x_C1 x_A2 x_B2 x_C2 x_A3 x_B3 x_C3 x_A4 x_B4 x_C4

Objective (maximize): 90.9 9.09 22.7 | 84. 8.4 21.1 | 80. 8.0 20. | 76.3 7.63 19.1

Subject to (constraints 1-12, three per year t = 1, ..., 4):
    Budget:         0.16 x_At + 0.018 x_Bt + 0.055 x_Ct <= B_t
    Manpower u:     4. x_At + 0.30 x_Bt + 0.60 x_Ct <= M_ut
    Manpower s:     1.9 x_At + 0.16 x_Bt + 0.30 x_Ct <= M_st
Depletion, j = A, B, C; t = 2, 3, 4:    0.9 x_j,t-1 - 1. x_jt <= 0
Initial year:                           1.0 x_A1 >= 9.0;  1.0 x_B1 >= 45.0;  1.0 x_C1 >= 18.0

Right Hand Sides (Constraints 1-12):

    Designation    Prob 2    Iter 1    Iter 2
    B_1            3.7       3.528     3.474
    M_u1           64.65     68.14     71.6
    M_s1           31.85     38.36     40.4
    B_2            3.73      3.51      3.453
    M_u2           63.13     67.87     69.66
    M_s2           30.12     40.38     41.34
    B_3            3.744     3.50      3.411
    M_u3           61.96     67.72     73.9
    M_s3           29.66     4.14      42.1
    B_4            3.75      3.496     3.415
    M_u4           60.95     67.68     73.05
    M_s4           29.86     41.89     42.45

PROBLEM 2, ITERATION 1

If the dual variables of the solution to Problem 2 are investigated, it becomes apparent that manpower is more valuable than money. Furthermore, s skills are more valuable than u skills. If, therefore, the recruiting budget is first increased by 67% each year, and 67% of this increase is directed toward converting more recruits to s skills, the resulting manpower levels and cost changes lead to the solution shown in Table 3. Notice that the objective value has increased, but no annual budget was changed. Further refinements are possible, an obvious one being to shift some training funds into recruiting since, of the manpower related duals, only the u skills are now binding.

Table 3. Solutions

                   Problem 1    Problem 2    Iter 1    Iter 2
    Objective      6800         6430         6536      6450

    Primal Variables
    x_A1           11.38        9            10.71     10.38
    x_B1           45           45           45        45
    x_C1           18           25.17        18        18
    x_A2           12.4         8.1          11        11.3
    x_B2           40.5         40.5         46.5      40.5
    x_C2           16.2         27.5         16.2      16.2
    x_A3           13.3         7.29         10.7      12.14
    x_B3           36.4         36.45        53.8      36.45
    x_C3           14.6         33.25        14.6      14.58
    x_A4           14.1         6.56         10.4      13.08
    x_B4           32.8         44.29        60.7      32.8
    x_C4           13.1         34.36        13.12     13.12

    Dual Variables (Problem 1: Not Applicable; remaining entries in the order printed)
    Constraint 1     568     568
    Constraint 2
    Constraint 3     75.7
    Constraint 4     333     525
    Constraint 5     7.67
    Constraint 6     70
    Constraint 7     3.7     500
    Constraint 8     7.3
    Constraint 9     66.7
    Constraint 10    230.7   302.8   477
    Constraint 11    6.96
    Constraint 12    21.29

PROBLEM 2, ITERATION 2

If the reallocations of budgets are carried too far, the increase in the objective may be less than for smaller readjustments. If the Problem 2 adjustments are 73% instead of the 67% made in Iteration 1, then the results are as shown in Table 3. The objective is less than for Iteration 1, and note that only the budget constraints are binding, as all other duals are zero.
Manpower is now less valuable than money, and funds should be shifted into operating budgets and out of recruiting/training.

III. APPENDIX

Derivation of Equation (C-2):

Were it not for the scalar $M_{00t}$ denoting the new recruits each year, the manpower vector required for the right hand side of the manpower constraints, $M_{rgct}$, would be obtained as follows:

(III-1) $M_{rgct} = \prod_{i=1}^{t} T_i \, M_{rgc0}$.†

†Since $M_{rgc1} = T_1 M_{rgc0}$ and $M_{rgct} = T_t M_{rgc,t-1}$ for all t from 2 to T, (III-1) is true by induction. Also, T does not include the final column of (C-1), nor does M include the number of resignees, since resignees are external to the system.

But this expression must be modified since the recruit-to-recruit transition rate is zero and new recruits arrive each year. That equation (C-2) holds will be shown in an inductive proof. First, for year 1, by definition

$M_{rgc1} = T_1 M_{rgc0}$.

Since $T_{00,00t}$, the transition of recruits to recruits, is zero for each year t, $M_{rgc1}$ will have its first element equal to zero. To obtain a valid manpower supply vector in year 1 the recruits arriving in year 1 must be added to $M_{rgc1}$. Define $M_{00t}$ to be the column vector

$(M_{00t}, 0, 0, \ldots, 0)$,

and let

$M'_{rgct} = M_{rgct} + M_{00t}$, for all t.

For year 1,

$M'_{rgc1} = M_{rgc1} + M_{001}$,

and, therefore,

$M'_{rgc1} = T_1 M_{rgc0} + M_{001}$.

The next year's manpower vector can now be obtained:

$M_{rgc2} = T_2 M'_{rgc1} = T_2 (T_1 M_{rgc0} + M_{001}) = T_2 T_1 M_{rgc0} + T_2 M_{001}$

and

$M'_{rgc2} = T_2 T_1 M_{rgc0} + T_2 M_{001} + M_{002}$.

Proceeding in similar fashion,

$M_{rgc3} = T_3 T_2 T_1 M_{rgc0} + T_3 T_2 M_{001} + T_3 M_{002}$

and

$M'_{rgc3} = T_3 T_2 T_1 M_{rgc0} + T_3 T_2 M_{001} + T_3 M_{002} + M_{003}$.

It is clear by continuing that the expression for $M_{rgct}$ given in (C-2) is valid for the general case. The total manpower supply vector is then $M_{rgct}$ augmented by the number of recruits entering in year t.

BIBLIOGRAPHY

[1] Arrow, Kenneth J., and Mordecai Kurz, Public Investment, The Rate of Return, and Optimal Fiscal Policy (Baltimore: The Johns Hopkins Press, 1970).
[2] Carleton, Willard T., "Linear Programming and Capital Budgeting Models: A New Interpretation," The Journal of Finance, 825-833 (December, 1969).
[3] Charnes, A., and W. W.
Cooper, Management Models and Industrial Applications of Linear Programming (New York: John Wiley & Sons, 1961).
[4] Charnes, A., W. W. Cooper and R. J. Niehaus, Studies in Civilian Manpower Planning (NAVSO P-3540) (Washington, D.C.: Navy Department, Office of Civilian Manpower Planning, July 1972).
[5] Forbes, A. F., "Markov Chain Models for Manpower Systems," in Manpower and Management Science, pp. 93-113. Edited by D. H. Bartholomew and A. R. Smith (Lexington, Mass.: D. C. Heath & Co., 1971).
[6] Hirshleifer, J., "On the Theory of Optimal Investment Decisions," Journal of Political Economy, 329-352 (August, 1958).
[7] Hitch, C. J., and R. N. McKean, The Economics of Defense in the Nuclear Age (Santa Monica, Calif.: The RAND Corporation, 1960).
[8] Kornai, J. and T. Liptak, "Two Level Planning," Econometrica, 33, 141-169 (1965).
[9] Quade, E. S., and W. Boucher, eds., Systems Analysis and Policy Planning: Applications in Defense (New York: Elsevier, 1968).
[10] Weingartner, H. Martin, Mathematical Programming and the Analysis of Capital Budgeting Problems (Chicago: Markham Publishing Co., 1967).

AN F APPROXIMATION FOR TWO-PARAMETER WEIBULL AND LOGNORMAL TOLERANCE BOUNDS BASED ON POSSIBLY CENSORED DATA*

Nancy R. Mann
Rockwell International
Thousand Oaks, California

ABSTRACT

An approximation suggested in Mann, Schafer and Singpurwalla [18] for obtaining small-sample tolerance bounds based on possibly censored two-parameter Weibull and lognormal samples is investigated. The tolerance bounds obtained are those that effectively make most efficient use of sample data. Values based on the approximation are compared with some available exact values and shown to be in surprisingly good agreement, even in certain cases in which sample sizes are very small or censoring is extensive. Ranges over which error in the approximation is less than about 1 or 2 percent are determined.
The investigation of the precision of the approximation extends results of Lawless [8], who considered large-sample maximum-likelihood estimates of parameters as the basis for approximate 95 percent Weibull tolerance bounds obtained by the general approach described in [18]. For Weibull (or extreme-value) data the approximation is particularly useful when sample sizes are moderately large (more than 25), but not large enough (well over 100 for severely censored data) for asymptotic normality of estimators to apply. For such cases simplified efficient linear estimates or maximum-likelihood estimates may be used to obtain the approximate tolerance bounds. For lognormal censored data, best linear unbiased estimates may be used, or any efficient unbiased estimators for which variances and covariances are known as functions of the square of the distribution variance.

*The research documented herein was supported by the Air Force Office of Scientific Research, AFSC, USAF under Contract No. F44620-71-C-0029.

1. INTRODUCTION

In the following we consider tolerance bounds for two-parameter Weibull and lognormal distributions, or equivalently, extreme-value (also known as Gumbel) and normal distributions, respectively. In the context of reliability, when the distributions are failure-time populations, the tolerance bounds are confidence bounds on reliable life $t_P$, a population percentile corresponding to a specified survival probability R, with $P = 1 - R$.

We assume that there may be type-II censoring of the data. In a life-testing context, this means that the life test applied to a sample of size n is terminated at the time of the rth failure, $r \le n$. We order the observable variates from smallest to largest and call their logarithms $X_{1,n}, \ldots, X_{r,n}$. Thus if the data are Weibull (lognormal), the sample of unordered X's is a sample of variates from the extreme-value (normal) distribution.
In either case the distribution of the X's has a location parameter $\mu$ and a scale parameter $\sigma$. If the X's are extreme-value variates, their density function has the form

(1.1) $f_X(x) = \dfrac{1}{\sigma} \exp\left[-\exp\left(\dfrac{x-\mu}{\sigma}\right)\right] \exp\left(\dfrac{x-\mu}{\sigma}\right)$.

If the X's are normal, their density function has the familiar form

$f_X(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{(x-\mu)^2}{2\sigma^2}\right]$.

For Weibull or extreme-value data, if censoring is not extensive, one can use Monte Carlo-generated tables of Thoman, Bain and Antle [22] and Billman, Antle and Bain [2] to obtain tolerance bounds. The tables are used in conjunction with iteratively obtained maximum-likelihood estimates of distribution parameters. If sample size n is 25 or less, one can determine tolerance bounds from tables of Mann and Fertig [14], Mann, Fertig and Scheuer [16], or Mann, Schafer and Singpurwalla [18] for r = 3(1)n. These tables have also been generated by Monte Carlo procedures. They require the use of best linear invariant estimates

$\sum_{i=1}^{r} a_{i,r,n}\,x_{i,n}$ and $\sum_{i=1}^{r} c_{i,r,n}\,x_{i,n}$

of the parameters $\mu$ and $\sigma$, with values of $\{a_{i,r,n}\}$ and $\{c_{i,r,n}\}$ available in Mann [11, 12] and in [18].

Results of Engelhardt and Bain [4], Mann and Fertig [15], and Thoman, Bain and Antle [22] indicate that maximum-likelihood estimators and best linear invariant estimators give very nearly the same results for the extreme-value distribution, even for small sample sizes and rather extreme censoring. Moreover, for extreme-value data, the distributions of these two types of parameter estimators are very nearly the same.

For lognormal data that are not censored, tolerance bounds can be determined by simply disregarding the order of the observations, $x_{1,n}, \ldots, x_{n,n}$, and calculating $\bar{x}$ and s to be used with tables of the noncentral t-distribution.

The approximation to be described in Section 2 is suggested in [18] as a method for obtaining tolerance bounds under conditions of sample size or censoring for which the Monte Carlo generated tables or the noncentral t-tables described above are not applicable.
The approximation is based on efficient linear estimators such as best linear invariant (B.L.I.) estimators or linear transformations of these estimators which are best linear unbiased (B.L.U.). One can also use maximum-likelihood (M.L.) estimators or efficient approximations to the optimum linear estimators, as long as information concerning variances and covariances of unbiased versions of the estimators is available. All of the B.L.I., B.L.U. and M.L. estimators contain essentially all the information in the sample that can be used in making inferences about the distribution parameters (see, for example, Lawless [7]). The approximate lower tolerance bounds described below are thus essentially the most accurate available (have, in effect, the minimum probability of falling below any percentile less than the true percentile of interest) among bounds independent of the parameters (see Lehman [14, p. 78]).

2. THE F APPROXIMATION

The result that provides the basis for the F approximation is that of Pyke [20] applying to any difference of adjacent ordered observable variates (called a "spacing" by Pyke) from a continuous distribution. Pyke showed that as sample size increases, the distribution of each such difference approaches that of a weighted exponential variate, or equivalently, that of a weighted chi-square with 2 degrees of freedom. Recently van Montfort [23] observed that any spacing

(2.1) $H_i = X_{i+1,n} - X_{i,n}, \quad i = 1, \ldots, n-1$,

from the extreme-value distribution, when divided by its expectation, has approximately an exponential distribution with mean 1 and variance very near to 1 [thus $2H_i/E(H_i)$ is approximately a chi-square with 2 degrees of freedom] even for sample size n as small as 3. Also, van Montfort observed that for a size-n sample of extreme-value variates with n as small as 3, the covariance between $H_i$ and $H_j$, $i \ne j$, is approximately zero.
Pyke [20] showed that in general $H_i$ and $H_j$ are asymptotically independent for $p_i = i/n$ and $p_j = j/n$ fixed with increasing n. Using tables, in Sarhan and Greenberg [21], of expectations, variances and covariances of reduced order statistics from the normal distribution, one can see that the properties observed by van Montfort for the extreme-value distribution hold for Gaussian spacings for sample sizes about 6 or larger.

Mann and Fertig [15] combined these results with that of Box [3], demonstrating that linear combinations of chi-squares are weighted chi-squares. They showed that, for sample size greater than about 3, any efficient unbiased linear estimator $\sigma^*_{r,n}$ of the extreme-value scale parameter with variance $C_{r,n}\sigma^2$ is such that $\sigma^*_{r,n}/\sigma$ is approximately a chi-squared variate over its degrees of freedom with $2/C_{r,n}$ degrees of freedom. Here the authors used the two-moment fit of Patnaik [19] of a weighted chi-square $\sigma^*_{r,n} = \sum_i \ell_i (X_{i+1,n} - X_{i,n})$ with mean $m = \sigma$ and variance $v = C_{r,n}\sigma^2$ to a chi-square ($2m\sigma^*_{r,n}/v$ with $2m^2/v$ degrees of freedom). Results of Grubbs, Coon and Pearson [6] and Fertig and Mann [5] indicate that under certain conditions this result is applicable also to efficient unbiased linear or maximum-likelihood estimators of Gaussian scale parameters.

Now for either an extreme-value or normal distribution, consider an efficient unbiased linear estimator $\mu^*_{r,n}$ of $\mu$ having variance $A_{r,n}\sigma^2$ and covariance $B_{r,n}\sigma^2$ with $\sigma^*_{r,n}$. The estimators $\mu^*_{r,n}$ and $\sigma^*_{r,n}$ are best linear unbiased estimators or efficient simplified approximations thereof. Form the statistic

$X^*_{r,n} = \mu^*_{r,n} - (B_{r,n}/C_{r,n})\,\sigma^*_{r,n}$,

which can easily be seen to have covariance $[B_{r,n} - (B_{r,n}/C_{r,n})C_{r,n}]\sigma^2 = 0$ with $\sigma^*_{r,n}$. Let $x_P$ be the 100Pth percentile of the distribution of X, with $P = 1 - R$. It is for $x_P$, or $\exp(x_P)$, that a lower confidence bound is desired.
The parameter $x_P$ is of the form $\mu + z_P\sigma$, where if X is a normal variate, then $z_P$ is the 100Pth percentile of a standard normal distribution with mean zero and variance unity. If X is an extreme-value variate, then, from (1.1), $z_P = \ln \ln[(1-P)^{-1}]$.

If one now forms $X^*_{r,n} - x_P$, as suggested in [18], it can be seen that the expectation $E_1$ of this difference is given by $E_1 = [(-B_{r,n}/C_{r,n}) - z_P]\sigma$. It can be shown that if P is sufficiently small for a specified r and n, $(X^*_{r,n} - x_P)/E_1$ is, with high probability, a positive random variate. Hence one can combine results of Fertig and Mann [5] and Mann [12], applying to prediction intervals, with previous empirical results of [8, 18] to infer that for appropriate combinations of r, n and P,

(2.2) $F_1 = (X^*_{r,n} - x_P)\,/\,[\sigma^*_{r,n}(-B_{r,n}/C_{r,n} - z_P)]$

has an approximate F distribution. The precision of the approximation is investigated in Section 3.

The number $\nu_1$ of degrees of freedom for the numerator $X^*_{r,n} - x_P$ of $F_1$ can be obtained from Patnaik's [19] two-moment fit: $(X^*_{r,n} - x_P)/m$, with $m = E_1$, is approximately a chi-squared variate over its degrees of freedom, and its number of degrees of freedom is equal to $2m^2/v$, where v is

(2.3) $\mathrm{Var}(X^*_{r,n}) = (A_{r,n} - B^2_{r,n}/C_{r,n})\,\sigma^2$.

Therefore, the numbers of degrees of freedom for the approximate F-variate $F_1$ are

(2.4) $\nu_1 = 2(-B_{r,n}/C_{r,n} - z_P)^2 / (A_{r,n} - B^2_{r,n}/C_{r,n})$

and

(2.5) $\nu_2 = 2/C_{r,n}$.

We can now determine, from (2.2), an approximate 100γ percent lower confidence bound for $t_P = \exp(x_P)$ as

(2.6) $\exp\{\mu^*_{r,n} - [B_{r,n}/C_{r,n} - (B_{r,n}/C_{r,n} + z_P)F_\gamma(\nu_1, \nu_2)]\,\sigma^*_{r,n}\}$

where $F_\gamma(\nu_1, \nu_2)$ is the 100γth percentile of an F distribution with $\nu_1$ and $\nu_2$ degrees of freedom. If $\nu_1$ and $\nu_2$ are not integers, which will generally be the case, then one can interpolate in tables of percentiles of the F distribution or use an approximation given by Mann and Grubbs [17].
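The calculation in (2.4) through (2.6) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the F percentile is supplied by the caller (from tables or any numerical routine) rather than computed here, and the numerical inputs in the usage lines are invented to show the arithmetic, not taken from any published table of A, B and C.

```python
import math


def z_extreme_value(P):
    """z_P = ln ln[(1 - P)^(-1)] for the extreme-value distribution."""
    return math.log(-math.log(1.0 - P))


def degrees_of_freedom(A, B, C, z_p):
    """Degrees of freedom (nu1, nu2) from (2.4) and (2.5)."""
    nu1 = 2.0 * (-B / C - z_p) ** 2 / (A - B * B / C)
    nu2 = 2.0 / C
    return nu1, nu2


def lower_bound(mu_star, sigma_star, B, C, z_p, f_gamma):
    """Approximate lower confidence bound (2.6) for t_P = exp(x_P).

    f_gamma is the 100*gamma-th percentile of F(nu1, nu2), supplied
    by the caller.
    """
    ratio = B / C
    return math.exp(mu_star - (ratio - (ratio + z_p) * f_gamma) * sigma_star)


# Hypothetical inputs, for illustration only.
z_char_life = z_extreme_value(1.0 - math.exp(-1.0))   # characteristic life
nu1, nu2 = degrees_of_freedom(A=1.0, B=0.2, C=0.25, z_p=-2.0)
bound = lower_bound(0.0, 1.0, B=0.2, C=0.25, z_p=-2.0, f_gamma=1.0)
```

With $P = 1 - e^{-1}$ the function returns $z_P = 0$ up to rounding, in agreement with the characteristic-life case. If SciPy is available, `scipy.stats.f.ppf(gamma, nu1, nu2)` is one way to obtain the percentile for non-integer degrees of freedom.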
If $z_P = 0$ and $B_{r,n}/C_{r,n}$ is positive (r/n is small), then clearly a lower confidence bound at level γ for $\exp(\mu)$ is obtained by substituting $F_{1-\gamma}(\nu_1, \nu_2)$ for $F_\gamma(\nu_1, \nu_2) \ge F_{1-\gamma}(\nu_1, \nu_2)$ in (2.6). Using $F_\gamma(\nu_1, \nu_2)$ in (2.6) when $B_{r,n}/C_{r,n}$ is positive will yield an upper confidence bound for $\exp(\mu)$ at level γ.

If best linear invariant estimators $\tilde\sigma_{r,n} = \sigma^*_{r,n}/(1 + C_{r,n})$ and $\tilde\mu_{r,n} = \mu^*_{r,n} - B_{r,n}\tilde\sigma_{r,n}$ have been used to estimate $\mu$ and $\sigma$, then one can use

(2.7) $\exp\{\tilde\mu_{r,n} + \{B_{r,n} + (1 + C_{r,n})[z_P F_\gamma(\nu_1, \nu_2) - (1 - F_\gamma(\nu_1, \nu_2))B_{r,n}/C_{r,n}]\}\,\tilde\sigma_{r,n}\}$

to determine an approximate 100γ percent lower confidence bound for $t_P$. Here values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ can be determined from tabulated values (see [11, 18]) of $E(L\mu)$ and $E(L\sigma)$, the mean squared errors of $\tilde\mu/\sigma$ and $\tilde\sigma/\sigma$, respectively, and the expected cross product $E(CP) = E[(\tilde\mu - \mu)(\tilde\sigma - \sigma)]/\sigma^2$. To do this, one uses the following relationships:

(2.8) $C_{r,n} = E(L\sigma)/[1 - E(L\sigma)]$,

(2.9) $B_{r,n} = E(CP)/[1 - E(L\sigma)]$

and

(2.10) $A_{r,n} = E(L\mu) + [E(CP)]^2/[1 - E(L\sigma)]$.

For large sample sizes, simplified efficient unbiased linear estimators of $\mu$ and $\sigma$ (see [1], [4] and [15]) can be used in (2.6) with corresponding values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n} = \ell_{r,n}$. Also, as will be shown in Section 3, maximum-likelihood estimators of $\mu$ and $\sigma$ can be substituted for $\tilde\mu_{r,n}$ and $\tilde\sigma_{r,n}$ in (2.7). In the latter case, values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ corresponding to linear estimators are still appropriate, where available.

3. PRECISION OF THE APPROXIMATION

Several examples are given in [18] showing the excellent agreement of Weibull tolerance bounds based on (2.7) with exact (Monte Carlo) tolerance bounds obtained from tables of [14, 16]. We now show some further examples, among several cases investigated in the present study, applying to both Weibull and Gaussian data, and we attempt to draw some guidelines regarding the precision of the approximation.

3.1. Weibull Tolerance Bounds

In the tables below, values of P are those considered in [13], i.e., P = .01, .05, .10 and $P = 1 - e^{-1}$ (corresponding to $z_P = 0$, which implies the confidence bound is for $\exp(\mu)$, the Weibull scale parameter, or characteristic life). For each combination of values of P, r and n, the corresponding $\nu_1$ is displayed. Since $\nu_2$ depends only on n and r, it is exhibited only once. Exact table entries $V_{P,\gamma}$ are among those found in [14, 16], and for $\gamma \ge .5$ are such that $P[x_P > \tilde\mu_{r,n} - V_{P,\gamma}\tilde\sigma_{r,n}] = \gamma$. More generally $V_{P,\gamma}$ is the 100γth percentile of $V_P = (\tilde\mu_{r,n} - x_P)/\tilde\sigma_{r,n}$. The approximate values of $V_{P,\gamma}$ are calculated from the coefficient of $\tilde\sigma_{r,n}$ in (2.7).

Values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ have been obtained by the linear transformations (2.8), (2.9) and (2.10) given in Section 2. The transformations are applied to $E(L\mu)$, $E(L\sigma)$ and $E(CP)$, which are values proportional to expected squared errors and cross products for best linear invariant estimators, tabulated in [11, 12, 18]. The approximation described in [17] has been used in each case for $F_\gamma(\nu_1, \nu_2)$ since $\nu_1$ and $\nu_2$ are, in general, not integers.

Table 1 gives a representation of 3 out of 15 combinations of sample size n and censoring number r investigated in the present study. The combinations of n and r for the 15 cases were n = 5(5)20, r = 5(5)n, and n = 24, r = 5(5)20, 24. Approximate values in Table 1 in error by 2 percent or more are bracketed.

Lawless [8] used the F approximation as described in [18], with tables of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ applying to simplified efficient unbiased linear estimators, in conjunction with maximum-likelihood estimators. In other words, he substituted maximum-likelihood estimators for simplified versions of best linear invariant estimators in (2.7). He then compared values for obtaining 95 percent tolerance bounds based on (2.7) with exact values computed for maximum-likelihood estimators using numerical integration procedures. The cases Lawless considered involved large values of n,
The cases Lawless considered involved large values of $n$ ($n = 25$, 40 and 60) and values of $r/n$ ranging from .1 to .9. The values of $P$ he used were .05 and .10. For all of the 20 cases investigated by Lawless for which $r/n \ge .3$, the difference between the 95 percent tolerance bounds based on the $F$ approximation and the exact tolerance bounds based on the numerical integration procedures is within about 1 percent. For $r/n = .1$ or .2 and $P = .05$ the difference is within 3 percent. For very extreme censoring ($r/n = .1$) the approximation (2.7) gives extremely poor results for $P = .10$, apparently because of the small size of $\nu_1$ for these cases.

For the cases investigated in the present study, which clearly involved smaller values of sample size $n$ (and linear, rather than maximum-likelihood, estimators), the values of $V_{p,\gamma}$ for obtaining lower confidence bounds on 1st percentiles tended to be within 1 percent of exact (Monte Carlo) values except for samples of sizes 5 and 10. For $P = .05$ and .10 there were more discrepancies larger than 1 percent. For any combination of $\nu_1$ and $\nu_2$, and $\gamma = .75$, .90, .95, .98, the exact and the approximate values of $V_{p,\gamma}$ in both the Lawless study and the present study tend to show excellent agreement (less than about 1 percent error) in the range

(3.1) $\nu_2 \ge 5.5$, $2\nu_2 < \nu_1 < 18\nu_2 - 50$.

(This is similar in spirit to ranges given by Fertig and Mann [5] and Mann [13] applying to precision of approximate prediction intervals for Gaussian and Weibull data, respectively.) If $P = 1 - e^{-1}$ so that $x_p = \mu$, there is little discrepancy between the exact and approximate values of $V_{p,\gamma}$, $.02 \le \gamma \le .98$, for $r/n \ge .4$, $n \ge 25$. For $r/n \le .4$, $n \ge 15$, a chi-square approximation discussed in [15, 18, pp. 245-248] can be used.

Table 1. Approximate (from (2.7)) and Exact $100\gamma$th Percentiles $V_{p,\gamma}$ of $V_p = (\tilde\mu_{r,n} - x_p)/\tilde\sigma_{r,n}$ for Extreme-Value Data. In each block the first four value columns are approximate values from (2.7) and the last four are exact values from [13]; approximate values in error by 2 percent or more are in square brackets.

n = 24, r = 5, ν2 = 8.46
  P       1-e^-1       .10       .05       .01  |  1-e^-1     .10     .05     .01
  ν1       23.93      3.50     16.28     79.83  |   23.93    3.50   16.28   79.83
  γ
 .02    [-6.704]   [1.691]   [2.140]     3.152  |   -6.87    0.91    1.90    3.12
 .05    [-4.469]   [1.737]   [2.279]     3.441  |   -4.59    1.36    2.17    3.41
 .10      -3.090   [1.799]   [2.433]     3.755  |   -3.13    1.64    2.38    3.73
 .50    [-0.490]   [2.301]   [3.351]     5.575  |   -0.47    2.45    3.40    5.59
 .90       0.585     3.879     5.606     9.943  |    0.58    3.85    5.64   10.09
 .95       0.763   [4.802]     6.808    12.250  |    0.76    4.50    6.72   12.23
 .98       0.929   [6.377]     8.762    15.980  |    0.92    5.60    8.52   15.75

n = 24, r = 10, ν2 = 20.32
  P       1-e^-1       .10       .05       .01  |  1-e^-1     .10     .05     .01
  ν1       12.75     38.92     87.92    271.66  |   12.75   38.92   87.92  271.66
  γ
 .02    [-1.720]   [1.552]     2.051     3.126  |   -1.67    1.47    2.01    3.11
 .05      -1.228   [1.673]     2.221     3.408  |   -1.20    1.63    2.20    3.39
 .10      -0.877     1.797     2.393     3.695  |   -0.87    1.78    2.38    3.69
 .50    [-0.060]     2.415     3.242     5.101  |   -0.08    2.44    3.25    5.10
 .90     [0.373]     3.501     4.715     7.532  |    0.39    3.51    4.73    7.55
 .95     [0.454]     3.952     5.324     8.535  |    0.49    3.94    5.33    8.52
 .98     [0.530]     4.575     6.163     9.917  |    0.60    4.51    6.13    9.94

n = 10, r = 10, ν2 = 27.94
  P       1-e^-1       .10       .05       .01  |  1-e^-1     .10     .05     .01
  ν1        1.77    123.14    202.22    453.40  |    1.77  123.14  202.22  453.40
  γ
 .02    [-0.320]   [1.257]     1.757     2.864  |   -0.80    1.21    1.72    2.84
 .05    [-0.294]     1.441     1.985     3.195  |   -0.60    1.42    1.96    3.19
 .10    [-0.281]     1.624     2.213     3.526  |   -0.44    1.62    2.21    3.53
 .50    [-0.101]     2.484     3.278     5.071  |    0.04    2.50    3.29    5.07
 .90     [0.679]     3.857     4.974     7.529  |    0.54    3.86    4.98    7.57
 .95     [1.140]     4.390     5.633     8.484  |    0.71    4.41    5.67    8.57
 .98     [1.878]     5.101     6.510     9.753  |    0.92    5.16    6.65   10.03

3.2. Gaussian Tolerance Bounds

For evaluation of the $F$ approximation for Gaussian data, tables of Locks, Alexander and Byars [10] of the noncentral $t$-distribution were used. In each case considered, of course, $r = n$, and values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ corresponding to unbiased versions of maximum-likelihood estimates were employed.
We point out, however, that if, alternatively, we had used values corresponding to variances and covariances of best linear unbiased estimators of $\mu$ and $\sigma$ (see Sarhan and Greenberg [21]), $\nu_1$ would not be changed at all and $\nu_2$ would be changed by 1 percent or less.

The noncentral $t$-variate can be defined by

$(\bar X - \mu - z_p\sigma)/(S/\sqrt{n})$

with noncentrality parameter $\delta = -z_p\sqrt{n}$ (and $\mu + z_p\sigma = x_p$) and with degrees of freedom $n - 1$. Here $\bar X$ is the mean of $n$ normal variates with expectation $\mu$, and $S$, where $S^2 = \sum_{i=1}^{n}(X_i - \bar X)^2/(n - 1)$, has expectation $c_n\sigma$, with $c_n = \sqrt{2/(n-1)}\,\Gamma(n/2)/\Gamma[(n-1)/2]$. Then, since $\bar X$ and $S$ are independent, one can form an approximate $F$-statistic as

(3.2) $(\bar X - x_p)/(-z_p S/c_n)$

with

(3.3) $\nu_1 = 2n z_p^2$

and

(3.4) $\nu_2 = 2c_n^2/(1 - c_n^2)$.

Then, percentiles of the distribution of the approximate $F$-statistic (3.2) can be used to approximate percentiles of the noncentral $t$-distribution by multiplying the former by $-z_p\sqrt{n}/c_n$. In Table 2, exact $100\gamma$th percentiles of the noncentral $t$-distribution from tables of [9] for $\gamma = .005$, .01, .05, .10, .25, .50, .75, .90, .95, .99, and $-z_p = 1.0(.50)3.0$ are tabulated for a specified $n - 1$, with $n - 1 = 10$ and 35. For each value of $n - 1$ a tabulation of corresponding percentiles based on the $F$ approximation (3.2), (3.3) and (3.4) is exhibited. Approximate values in error by more than two percent are bracketed.

It can be seen from Table 2 that the approximation is excellent for $\gamma = .5$ or more (the range of present interest) for $n \ge 9$ (or $\nu_2 \ge 27.1$) when one is concerned with 0.14th through 16th percentiles of normal and lognormal distributions. The tabulated results, of course, apply also to 84th through 99.86th distribution percentiles when the sign of $z_p$ is changed. The ranges over which the error in the $F$ approximation is about 2 percent or less are given roughly by

(3.5) $\gamma = .99$: $\nu_2 \ge 27$, $.4\nu_2 + 10 < \nu_1 < 5.2\nu_2 - 90$

(3.6) $\gamma = .95$: $\nu_2 \ge 27$, $.4\nu_2 + 10 < \nu_1 < 5.9\nu_2 - 22$

(3.7) $\gamma = .90$: $\nu_2 \ge 27$, $.4\nu_2 + 10 < \nu_1$

(3.8) $\gamma = .50, .75$: $\nu_2 \ge 27$, $.7\nu_2 + 15 < \nu_1$

Table 2. Approximate (from (2.7)) and Exact Percentiles of the Noncentral $t$-distribution with Noncentrality Parameter $-\sqrt{n}\,z_p$. Approximate values in error by more than two percent are in square brackets.

Approximate Values from (2.7); n - 1 = 10, ν2 = 39.1
 -z_p       1.00       1.50       2.00       2.50       3.00
 ν1         22.0       49.4       87.9      137.5      198.0
 γ
 .005    [1.193]    [2.348]      3.482      4.585      5.666
 .01     [1.326]    [2.533]      3.713      4.863      5.990
 .05     [2.032]    [3.472]      4.430      5.718      6.987
 .10     [2.418]    [3.965]      4.872      6.243      7.598
 .25     [2.582]      4.171      5.723      7.254      8.772
 .50     [3.247]      5.116      6.864      8.605     10.342
 .75     [4.322]      6.289      8.272     10.267     12.270
 .90       5.435      7.600      9.836     12.110     14.406
 .95       6.232      8.526     10.936   [13.404]   [15.905]
 .99       8.067   [10.628]   [13.423]   [16.327]   [19.289]

Exact Values from [9]; n - 1 = 10, ν2 = 39.1
 -z_p       1.00       1.50       2.00       2.50       3.00
 ν1         22.0       49.4       87.9      137.5      198.0
 γ
 .005      0.725      2.156      3.42?      4.618      5.76?
 .01       0.961      2.385      3.674      4.892      6.069
 .05       1.608      3.053      4.414      5.729      7.017
 .10       1.966      3.443      4.859      6.240      7.601
 .25       2.606      4.170      5.705      7.223      8.73?
 .50       3.412      5.127      6.844      8.562     10.278
 .75       4.372      6.308      8.276     10.261     12.256
 .90       5.434      7.647      9.918     12.224     14.550
 .95       6.193      8.618     11.119     13.665     16.238
 .99       7.968     10.916     13.979     17.109

Approximate Values from (2.7); n - 1 = 35, ν2 = 138.9
 -z_p       1.00       1.50       2.00       2.50       3.00
 ν1         72.0      162.0      288.0      450.0      648.0
 γ
 .005    [3.490]      5.958      8.373     10.745     13.088
 .01     [3.684]      6.206      8.674     11.101     13.501
 .05       4.602      6.934      9.555     12.142     14.707
 .10       5.047      7.356     10.064     12.742     15.402
 .25       5.228      8.123     10.984     13.827     16.658
 .50       6.011      9.069     12.115     15.157     18.197
 .75       6.903     10.131     13.379     16.642     19.914
 .90       7.815     11.201     14.649     18.132     21.636
 .95       8.413     11.898     15.473     19.098     22.751
 .99       9.658     13.334     17.169     21.083     25.043

Exact Values from [9]; n - 1 = 35, ν2 = 138.9
 -z_p       1.00       1.50       2.00       2.50       3.00
 ν1         72.0      162.0      288.0      450.0      648.0
 γ
 .005      3.255      5.873      8.368     10.794     13.17?
 .01       3.499      6.139      8.671     11.141     13.57?
 .05       4.188      6.908      9.554     12.158     14.73?
 .10       4.572      7.345     10.062     12.747     15.415
 .25       5.244      8.124     10.977     13.812     16.63?
 .50       6.048      9.076     12.106     15.136     18.16?
 .75       6.928     10.138     13.377     16.634     19.901
 .90       7.799     11.206     14.667     18.161     21.67?
 .95       8.364     11.906     15.517     19.169     22.84?
 .99       9.528     13.362     17.294     21.285     25.31?

For $.50 \le \gamma \le .90$, the upper limit on $\nu_1 = 2n z_p^2$ corresponds to the extreme upper or lower tail of a normal or lognormal distribution. These ranges should be applicable even when samples are censored. Thus, the approximation (2.6) can be used with best linear unbiased estimates of normal parameters to obtain lognormal tolerance bounds when $\nu_1$ and $\nu_2$ given by (2.4) and (2.5) fall within one of the various ranges specified by (3.5) through (3.8). Also, work has begun by this author to generate constants for use in obtaining simplified linear estimates from large censored normal and lognormal samples. It should then be possible to use these in conjunction with the approximation (2.6).

REFERENCES

[1] Bain, L. J., "Inferences Based on Censored Sampling from the Weibull or Extreme-Value Distribution," Technometrics, 14, 693-702 (1972).
[2] Billman, B. R., C. L. Antle, and L. J. Bain, "Statistical Inference from Censored Weibull Samples," Technometrics, 14, 831-840 (1972).
[3] Box, G. E. P., "Some Theorems on Quadratic Forms Applied in the Study of Variance Problems, I. Effect of Inequality of Variance in the One-Way Classification," Ann. Math. Statist., 25, 290-302 (1954).
[4] Engelhardt, M. and L. J. Bain, "Some Complete and Censored Sampling Results for the Weibull or Extreme-Value Distribution," Technometrics, 15, 541-549 (1973).
[5] Fertig, K. W. and N. R. Mann, "A New Approach to the Determination of Exact and Approximate One-Sided Prediction Intervals for Normal and Lognormal Distributions, with Tables," in Reliability and Fault-Tree Analysis, R. Barlow, Ed., SIAM Series in Applied Mathematics (1974).
[6] Grubbs, F. E., H. J. Coon, and E. S.
Pearson, "On the Use of Patnaik Type Approximations to the Range in Significance Tests," Biometrika, 53, 248-252 (1966).
[7] Lawless, J. F., "Conditional Versus Unconditional Confidence Intervals for the Parameters of the Weibull Distribution," J. Amer. Statist. Assoc., 68, 665-669 (1973).
[8] Lawless, J. F., "Construction of Tolerance Bounds for the Extreme-Value and Weibull Distributions," Technometrics, 17, 255-261 (1975).
[9] Lehmann, E. L., Testing Statistical Hypotheses (John Wiley, New York 1959).
[10] Locks, M. O., M. J. Alexander and B. J. Byars, "New Tables of the Noncentral t Distribution," Aerospace Research Laboratories Report ARL 63-19, Aerospace Research Laboratories, Wright-Patterson Air Force Base, Ohio (1963).
[11] Mann, N. R., "Results on Location and Scale Parameter Estimation with Application to the Extreme-Value Distribution," Aerospace Research Laboratories Report ARL 67-0023, Office of Aerospace Research, U.S. Air Force, Wright-Patterson Air Force Base, Ohio (1967).
[12] Mann, N. R., "Tables for Obtaining the Best Linear Invariant Estimates of Parameters of the Weibull Distribution," Technometrics, 9, 629-645 (1967).
[13] Mann, N. R., "Warranty Periods for Production Lots Based on Fatigue-Test Data," Engineering Fracture Mechanics, 8, 123-130 (1976).
[14] Mann, N. R. and K. W. Fertig, "Tables for Obtaining Confidence Bounds and Tolerance Bounds Based on Best Linear Invariant Estimates of Parameters of the Extreme-Value Distribution," Technometrics, 15, 87-101 (1973).
[15] Mann, N. R. and K. W. Fertig, "Simplified Efficient Point and Interval Estimators for Weibull Parameters," Technometrics, 17, 361-368 (1975).
[16] Mann, N. R., K. W. Fertig, and E. M. Scheuer, "Confidence and Tolerance Bounds and a New Goodness-of-Fit Test for Two-Parameter Weibull or Extreme-Value Distributions with Tables for Censored Samples of Size 3(1)25," Aerospace Research Laboratories Report ARL 71-0077, Office of Aerospace Research, United States Air Force, Wright-Patterson Air Force Base, Ohio (1971).
[17] Mann, N. R. and F. E. Grubbs, "Simple, Efficient Closed-Form Approximations for Beta Percentiles, Exponential Prediction Intervals and Confidence Bounds on Exponential and Binomial Parameters," J. Amer. Statist. Assoc., 69, 654-661 (1974).
[18] Mann, N. R., R. E. Schafer, and N. D. Singpurwalla, Methods for the Statistical Analysis of Reliability and Life Data (John Wiley, New York 1974).
[19] Patnaik, P. B., "The Non-Central $\chi^2$ and F Distributions and Their Applications," Biometrika, 36, 202-232 (1949).
[20] Pyke, R., "Spacings," J. Royal Statist. Soc. B, 27, 395-449 (1965).
[21] Sarhan, A. E. and B. G. Greenberg, Contributions to Order Statistics (John Wiley 1962).
[22] Thoman, D. R., L. J. Bain, and C. E. Antle, "Inferences on the Parameters of the Weibull Distribution," Technometrics, 11, 445-460 (1969).
[23] van Montfort, M. A. J., "On Testing that the Distribution of Extremes is of Type I when Type II is the Alternative," Journal of Hydrology, 11, 421-427 (1970).

A NOTE ON A CONFIDENCE INTERVAL FOR AN INTERCLASS MEAN

T. Jayachandran*
Naval Postgraduate School
Monterey, California

ABSTRACT

An exact confidence interval for an interclass mean, that is, the mean of a composite sample made up of several subsamples of unequal sizes $n_i$, is presented.

1. INTRODUCTION

Suppose $X_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$ is a composite sample of size $\sum_{i=1}^{k} n_i$ that is comprised of $k$ subsamples of sizes $n_i$. The $i$th subsample is a random sample from a normal distribution $N(a_i, \sigma^2)$ and the $a_i$'s, $i = 1, 2, \ldots, k$ are assumed to be independent and identically distributed (i.i.d.) as $N(\mu, \sigma_a^2)$.
$\mu$ is sometimes known as the interclass mean and the problem considered in this note is the construction of a confidence interval for $\mu$. Some practical uses of such an interval are discussed in a paper by Long [1]. For example, the composite sample may be measurements on a characteristic of the output of a factory made on different days; $\mu$ would correspond to the true mean value of the characteristic being studied.

The construction of a confidence interval for $\mu$ is straightforward if all the $n_i$ are equal. For unequal subsample sizes Long [1] obtained an approximate interval that he shows to be reasonably accurate. The procedure proposed in this paper leads to an exact confidence interval for $\mu$ in the case of unequal $n_i$.

*This research was supported by the Office of Naval Research as part of the Foundation Research Program at the Naval Postgraduate School.

2. PRELIMINARIES

The observations $X_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$ may be assumed to satisfy the variance components model

(2.1) $X_{ij} = a_i + e_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$

where the $a_i$'s are i.i.d. $N(\mu, \sigma_a^2)$, the $e_{ij}$'s are i.i.d. $N(0, \sigma^2)$ and the $a_i$'s and $e_{ij}$'s are mutually independent. Let

$\bar X_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}$

and

$\bar X = \frac{1}{k}\sum_{i=1}^{k} \bar X_i$, $S_{\bar X}^2 = \frac{1}{k-1}\sum_{i=1}^{k}(\bar X_i - \bar X)^2$.

Then $\bar X_i$, $i = 1, 2, \ldots, k$ are independent and normally distributed $N(\mu, \sigma_a^2 + \sigma^2/n_i)$. If $n_i = n$ for all $i$, the $\bar X_i$ will constitute a random sample from a normal distribution $N(\mu, \sigma_a^2 + \sigma^2/n)$. Thus,

(2.2) $T = k^{1/2}(\bar X - \mu)/S_{\bar X}$

has a Student's $t$ distribution with $(k - 1)$ degrees of freedom, and this leads to a confidence interval for $\mu$.

For the case where the $n_i$ are not all equal, Long [1] obtained an approximate confidence interval for $\mu$ by treating $T$ as a Student's $t$ variable. On the basis of a study of the exact distribution of $T$ for $k = 2, 3$ and an examination of the moments of $T$ for large values of $k$, Long [1] concludes that the $t$ approximation is fairly reliable. He also points out that the approximation is likely to go wrong if there are wide variations in the subsample sizes $n_i$ or if $\sigma^2$ is large relative to $\sigma_a^2$.

3. EXACT CONFIDENCE INTERVAL WITH UNEQUAL SUBSAMPLE SIZES

For unequal subsample sizes, the $\bar X_i$ are not identically distributed since the variance of $\bar X_i$ is $\sigma_a^2 + \sigma^2/n_i$, even though all of them have the same mean $\mu$. Thus, the construction of a Student's $t$ variable based on the $\bar X_i$ only, independent of the nuisance parameters $\sigma_a^2 + \sigma^2/n_i$, is not possible. To get around this difficulty let

(3.1) $Z_i = C_{i1}X_{i1} + C_{i2}\bar X_i$, $i = 1, 2, \ldots, k$

where

$C_{i1} = \left(\frac{n_i c - 1}{n_i - 1}\right)^{1/2}$, $C_{i2} = 1 - C_{i1}$

and

$c = \frac{1}{\min_i n_i}$.

Then, as shown in the appendix, the $Z_i$ are independent and identically distributed as

$N\left(\mu, \sigma_a^2 + \frac{\sigma^2}{\min_i n_i}\right)$.

If

$\bar Z = \frac{1}{k}\sum_{i=1}^{k} Z_i$ and $s_Z^2 = \frac{1}{k-1}\sum_{i=1}^{k}(Z_i - \bar Z)^2$

then

(3.2) $T = k^{1/2}(\bar Z - \mu)/s_Z$

has a Student's $t$-distribution with $k - 1$ degrees of freedom. An exact confidence interval for $\mu$ is now obtainable in the usual way.

4. DISCUSSION

The exact procedure proposed in Section 3 is somewhat ad hoc in nature in the sense that the first observation $X_{i1}$ in each subsample plays a prominent role. To avoid any systematic bias that may creep in, it would be preferable to randomly permute the observations in each of the $k$ subsamples before applying the procedure. If all the subsample sizes $n_i$ are equal to $n$, the statistic (3.2) reduces to the usual $T$ statistic (2.2).

APPENDIX

Under the assumptions of the variance components model (2.1),

$E(X_{ij}) = \mu$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$,

$\mathrm{Var}(X_{ij}) = \sigma_a^2 + \sigma^2$,

$\mathrm{cov}(X_{ij}, X_{i'j'}) = \sigma_a^2$ if $i = i'$, $j \ne j'$, and $0$ if $i \ne i'$,

$E(\bar X_i) = \mu$,

and it follows that

$\mathrm{Var}(\bar X_i) = \sigma_a^2 + \frac{\sigma^2}{n_i}$

and

$\mathrm{Cov}(X_{ij}, \bar X_i) = \sigma_a^2 + \frac{\sigma^2}{n_i}$, $j = 1, 2, \ldots, n_i$.

Therefore, if

$Z_i = C_{i1}X_{i1} + C_{i2}\bar X_i$

then

$E(Z_i) = (C_{i1} + C_{i2})\mu$

and

$V(Z_i) = C_{i1}^2 V(X_{i1}) + C_{i2}^2 V(\bar X_i) + 2C_{i1}C_{i2}\,\mathrm{Cov}(X_{i1}, \bar X_i)$.

Substitution of

$C_{i1} = \left(\frac{n_i c - 1}{n_i - 1}\right)^{1/2}$, $C_{i2} = 1 - C_{i1}$ and $c = \frac{1}{\min_i n_i}$

results in

$E(Z_i) = \mu$ and $V(Z_i) = \sigma_a^2 + \frac{\sigma^2}{\min_i n_i}$.

REFERENCES

1. Long, W. M., "Estimation Problems When a Simple Type of Heterogeneity is Present in the Sample," Biometrika, 38, 90-101 (1951).
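The procedure of Section 3, together with the random permutation suggested in Section 4, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's own implementation: the function names and the sample data are hypothetical, the Student's $t$ critical value is supplied by the caller, and subsamples whose size equals $\min_i n_i$ are given $C_{i1} = 0$ directly (which also sidesteps the $0/0$ form when $n_i = 1$).

```python
import math
import random
from statistics import mean, stdev


def z_values(subsamples, rng=None):
    """Z_i = C_i1 * X_i1 + C_i2 * Xbar_i as in (3.1), with c = 1 / min n_i."""
    n_min = min(len(s) for s in subsamples)
    zs = []
    for s in subsamples:
        xs = list(s)
        if rng is not None:
            rng.shuffle(xs)  # random permutation suggested in Section 4
        n_i = len(xs)
        if n_i == n_min:
            c_i1 = 0.0  # n_i * c - 1 = 0, so Z_i is just the subsample mean
        else:
            # (n_i * c - 1) / (n_i - 1) rewritten with c = 1 / n_min
            c_i1 = math.sqrt((n_i - n_min) / (n_min * (n_i - 1)))
        zs.append(c_i1 * xs[0] + (1.0 - c_i1) * mean(xs))
    return zs


def interclass_ci(subsamples, t_crit, rng=None):
    """Two-sided interval Zbar -/+ t_crit * s_Z / sqrt(k), based on (3.2).

    t_crit is the Student's t critical value with k - 1 degrees of freedom,
    supplied by the caller (for example from a t table).
    """
    zs = z_values(subsamples, rng)
    k = len(zs)
    if k < 2:
        raise ValueError("need at least two subsamples")
    half_width = t_crit * stdev(zs) / math.sqrt(k)  # stdev divides by k - 1
    z_bar = mean(zs)
    return z_bar - half_width, z_bar + half_width


# Hypothetical data: three days of measurements with unequal subsample sizes.
days = [[10.1, 9.8, 10.4, 10.0], [9.5, 9.9], [10.6, 10.2, 10.3]]
# 4.303 is the usual two-sided 95 percent t value for 2 degrees of freedom.
print(interclass_ci(days, t_crit=4.303, rng=random.Random(0)))
```

When all subsamples have the same size, every $C_{i1}$ is zero and the interval is the one obtained from (2.2).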
INFORMATION FOR CONTRIBUTORS

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one or more referees.

Manuscripts submitted for publication should be typewritten, double-spaced, and the author should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted with the original.

A short abstract (not over 400 words) should accompany each manuscript. This will appear at the head of the published paper in the QUARTERLY.

There is no authorization for compensation to authors for papers which have been accepted for publication. Authors will receive 250 reprints of their published papers.

Readers are invited to submit to the Managing Editor items of general interest in the field of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections of the QUARTERLY.

NAVAL RESEARCH LOGISTICS QUARTERLY
MARCH 1977, VOL. 24, NO. 1
NAVSO P-1278

CONTENTS

ARTICLES

A Two-Echelon Inventory Model with Purchases, Dispositions, Shipments, Returns and Transshipments (B. Hoadley, D. P. Heyman)
Optimal Reject Allowance with Constant Marginal Production Efficiency (A. Beja)
A Chance-Constrained Distribution Problem (R. M. Reese, A. C. Stedry)
Elements of a Theory in Non-Convex Programming (C. Burdet)
Convex and Polaroid Extensions (C. Burdet)
A Cutting Plane Algorithm for the Bilinear Programming Problem (H. Vaish)
The Effect of Correlated Exponential Service Times on Single Server Tandem Queues (C. R. Mitchell, A. S. Paulson, C. A. Beswick)
Single-Lane Bridge Serving Two-Lane Traffic (Z. Eshcoli, I. Adiri)
Optimal Control for Multi-Servers Queueing Systems under Periodic Review (C. C. Huang, S. L. Brumelle, K. Sawaki, I. Vertinsky)
Cyclical Job Sequencing on Multiple Sets of Identical Machines (H. I. Stern, E. P. Rodriguez, M. L. Utter)
Johnson's Approximate Method for the 3 X n Job Shop Problem (W. Szwarc, G. K. Hutchinson)
A Convex Property of an Ordered Flow Shop Sequencing Problem (S. S. Panwalkar, A. W. Khan)
A Manpower Planning/Capital Budgeting Model (MAPCAB) (R. H. Clark, R. A. Comerford)
An F Approximation for Two-Parameter Weibull and Lognormal Tolerance Bounds Based on Possibly Censored Data (N. R. Mann)
A Note on a Confidence Interval for an Interclass Mean (T. Jayachandran)

OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217